- LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
  Paper • 2503.07536 • Published • 73
- Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model
  Paper • 2503.07703 • Published • 29
- Gemini Embedding: Generalizable Embeddings from Gemini
  Paper • 2503.07891 • Published • 24
- Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
  Paper • 2503.07572 • Published • 31
Collections including paper arxiv:2503.07572
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 87
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
  Paper • 2503.04724 • Published • 60
- LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
  Paper • 2503.07536 • Published • 73
- Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
  Paper • 2503.07572 • Published • 31

- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 138
- Recursive Introspection: Teaching Language Model Agents How to Self-Improve
  Paper • 2407.18219 • Published • 3
- Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
  Paper • 2408.16293 • Published • 26
- Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models
  Paper • 2409.04787 • Published • 1

- LLM Pruning and Distillation in Practice: The Minitron Approach
  Paper • 2408.11796 • Published • 58
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
  Paper • 2408.09174 • Published • 52
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 42
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
  Paper • 2408.11878 • Published • 57

- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 21
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 13
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69