Submitted by akhaliq · 55 · Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models · 17 authors · 5
Submitted by akhaliq · 51 · Beyond Language Models: Byte Models are Digital World Simulators · 6 authors · 4
Submitted by akhaliq · 34 · Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers · 11 authors · 3
Submitted by akhaliq · 25 · MOSAIC: A Modular System for Assistive and Interactive Cooking · 17 authors · 1
Submitted by akhaliq · 22 · DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models · 10 authors · 1
Submitted by akhaliq · 20 · Simple linear attention language models balance the recall-throughput tradeoff · 9 authors · 12
Submitted by akhaliq · 15 · ViewFusion: Towards Multi-View Consistency via Interpolated Denoising · 6 authors · 1