Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 1 day ago • 41
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs Paper • 2503.02003 • Published 11 days ago • 42
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published 4 days ago • 31
👩‍💻 OlympicCoder Collection Reasoning datasets and models for competitive coding • 4 items • Updated 3 days ago • 8
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 2 days ago • 232
Gemma 3 Collection All versions of Google's new multimodal models in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 29 items • Updated about 2 hours ago • 30
LettuceDetect: A Hallucination Detection Framework for RAG Applications Paper • 2502.17125 • Published 18 days ago • 8
Article LettuceDetect: A Hallucination Detection Framework for RAG Applications By adaamko • 14 days ago • 7
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning Paper • 2503.05379 • Published 7 days ago • 30
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning Paper • 2503.04697 • Published 8 days ago • 2
Article Introducing EuroBERT: A High-Performance Multilingual Encoder Model By EuroBERT and 3 others • 4 days ago • 117
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published 7 days ago • 25
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 25 days ago • 43
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published 11 days ago • 24
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published 11 days ago • 31