- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • arXiv:2404.02258 • 104 upvotes
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • arXiv:2403.19887 • 108 upvotes
- EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
  Paper • arXiv:2403.09977 • 11 upvotes
- SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series
  Paper • arXiv:2403.15360 • 13 upvotes
Ceshine Lee (ceshine)