DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis Paper • 2206.01062 • Published Jun 2, 2022 • 1
AAAR-1.0: Assessing AI's Potential to Assist Research Paper • 2410.22394 • Published Oct 29, 2024 • 16
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 208
Training and Finetuning Embedding Models with Sentence Transformers v3 Article • May 28, 2024 • 193
Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Article • Mar 22, 2024 • 77
Matryoshka Embedding Models Collection https://huggingface.co/blog/matryoshka • 14 items • Updated about 1 month ago • 16
Efficient Estimation of Word Representations in Vector Space Paper • 1301.3781 • Published Jan 16, 2013 • 6
Unifying Large Language Models and Knowledge Graphs: A Roadmap Paper • 2306.08302 • Published Jun 14, 2023 • 3
Research Lessons Collection understanding important lessons from machine learning research • 2 items • Updated Oct 21, 2024 • 2
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing Paper • 1808.06226 • Published Aug 19, 2018 • 1
LLaMA: Open and Efficient Foundation Language Models Paper • 2302.13971 • Published Feb 27, 2023 • 14
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 77