HF China Mirror Site

Daily Papers

by AK and the research community

Nov 8

Submitted by

zhangysk

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

·
19 authors

6

Submitted by

hongyuw

BitNet a4.8: 4-bit Activations for 1-bit LLMs

·
3 authors

6

Submitted by

VictoriaLinML

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

·
11 authors

2

Submitted by

wenqsun

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

·
7 authors

4

Submitted by

j-min

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

·
5 authors

4

Submitted by

davidchan

Analyzing The Language of Visual Tokens

·
6 authors

Submitted by

passing2961

Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

·
5 authors

3

Submitted by

jonathan-roberts1

Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?

·
3 authors

3

Submitted by

shehan97

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

·
7 authors

Submitted by

Lmxyy

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

·
10 authors

3

Submitted by

notmahi

DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

·
7 authors

2

Submitted by

scofield7419

RetrieveGPT: Merging Prompts and Mathematical Models for Enhanced Code-Mixed Information Retrieval

·
2 authors

Submitted by

ChuhanLi

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models

·
6 authors

2

Submitted by

kmcode

SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

·
6 authors

4

Submitted by

He-Yen

GazeGen: Gaze-Driven User Interaction for Visual Content Generation

·
8 authors

Submitted by

ShuhongZheng

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

·
5 authors

2