HF中国镜像站

Collections

Collections including paper arxiv:2502.08606

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published 22 days ago • 66
Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 24 days ago • 28
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading

Paper • 2502.12574 • Published 24 days ago • 11
Large Language Diffusion Models

Paper • 2502.09992 • Published 28 days ago • 103

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 21 days ago • 45
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Paper • 2502.12853 • Published 23 days ago • 28
Diverse Inference and Verification for Advanced Reasoning

Paper • 2502.09955 • Published 28 days ago • 17
Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46

Large Language Diffusion Models

Paper • 2502.09992 • Published 28 days ago • 103
Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46
Autonomy-of-Experts Models

Paper • 2501.13074 • Published Jan 22 • 42
Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 28

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 106
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 44
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16 • 29
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 19

Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published 28 days ago • 33
Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 29 days ago • 143

Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46

paper maybe useful

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Paper • 2502.08590 • Published 29 days ago • 40
Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46
Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published 23 days ago • 77

interesting papers

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 124
Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 22
Distillation Scaling Laws

Paper • 2502.08606 • Published 29 days ago • 46
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published 29 days ago • 28

2025 LLM Papers on HF中国镜像站 with Japanese Memos

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published Jan 6 • 40
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 100
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 84
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published Jan 16 • 27
