OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models Paper • 2307.03084 • Published Jul 5, 2023 • 1
OpenPrompt: An Open-source Framework for Prompt-learning Paper • 2111.01998 • Published Nov 3, 2021 • 1
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting Paper • 2402.13720 • Published Feb 21, 2024 • 7
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies Paper • 2404.06395 • Published Apr 9, 2024 • 22
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads Paper • 2410.01805 • Published Oct 2, 2024
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling Paper • 2502.14856 • Published 22 days ago • 7
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs Paper • 2502.12085 • Published 25 days ago • 2
MiniCPM RAG Suite Collection Embedding, re-ranking, generation -- the cornerstone of RAG. • 6 items • Updated 11 days ago • 12