Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 3 days ago • 233
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 8 days ago • 60
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 8 days ago • 79
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 11 days ago • 72
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published 17 days ago • 25
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 41
ReMamba: Equip Mamba with Effective Long-Sequence Modeling Paper • 2408.15496 • Published Aug 28, 2024 • 12
Foundation AI Papers Collection Curated List of Must-Reads on LLM reasoning at Temus AI team • 135 items • Updated Jun 15, 2024 • 32
Beyond KV Caching: Shared Attention for Efficient LLMs Paper • 2407.12866 • Published Jul 13, 2024 • 1
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models Paper • 2405.14831 • Published May 23, 2024 • 4
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts Paper • 2402.09727 • Published Feb 15, 2024 • 38
Farewell to Length Extrapolation, a Training-Free Infinite Context with Finite Attention Scope Paper • 2407.15176 • Published Jul 21, 2024 • 1
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Paper • 2409.00509 • Published Aug 31, 2024 • 39
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling Paper • 2407.02486 • Published Jul 2, 2024 • 1
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Paper • 2401.01325 • Published Jan 2, 2024 • 27
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey Paper • 2311.12351 • Published Nov 21, 2023 • 4
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4, 2024 • 64
Scavenging Hyena: Distilling Transformers into Long Convolution Models Paper • 2401.17574 • Published Jan 31, 2024 • 17