Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding Paper • 2310.05424 • Published Oct 9, 2023 • 1
DistiLLM: Towards Streamlined Distillation for Large Language Models Paper • 2402.03898 • Published Feb 6, 2024 • 1
HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning Paper • 2409.09085 • Published Sep 11, 2024
SeRA: Self-Reviewing and Alignment of Large Language Models using Implicit Reward Margins Paper • 2410.09362 • Published Oct 12, 2024
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published Mar 2025 • 27
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery Paper • 2310.18356 • Published Oct 24, 2023 • 24