ds4sd/SmolDocling-256M-preview

@@ -1,37 +1,55 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses

 ---
 library_name: transformers
+license: apache-2.0
+language:
+- en
+base_model:
+- HuggingFaceTB/SmolVLM-256M-Instruct
+pipeline_tag: image-text-to-text
 ---
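The metadata added in this change (`license`, `language`, `base_model`, `pipeline_tag`) sits in the YAML front matter between the two `---` lines. As a minimal sketch of how that block can be read programmatically, the snippet below parses only the small YAML subset that appears here (scalar values and flat lists). The function name `parse_front_matter` is hypothetical and is not part of any Hugging Face library; a real YAML parser would be the better choice for anything beyond this shape.

```python
def parse_front_matter(card_text):
    """Parse the simple YAML subset used in this card header.

    Handles only `key: value` scalars and flat `- item` lists; this is an
    illustrative helper, not a general YAML parser.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present

    meta = {}
    current_key = None  # key whose list items we are currently collecting
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter of the front matter
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_key = None
        else:
            meta[key] = []  # a list follows on the next lines
            current_key = key
    return meta


# The front matter exactly as it appears on the added side of this diff.
CARD = """---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- HuggingFaceTB/SmolVLM-256M-Instruct
pipeline_tag: image-text-to-text
---
### SmolDocling-256M-preview
"""

metadata = parse_front_matter(CARD)
print(metadata["license"], metadata["base_model"])
```

Keeping list-valued keys such as `language` and `base_model` as Python lists mirrors how they are written in the card, so a single base model and several base models are handled the same way.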
+### SmolDocling-256M-preview
+SmolDocling is a multimodal Image-Text-to-Text model that features
 ## Model Details
 ### Model Description
+### SmolDocling-256M-preview
+SmolDocling is a multimodal Image-Text-to-Text model designed for efficient document conversion. It retains Docling's most popular features while ensuring full compatibility with Docling through seamless support for **DoclingDocuments**.
+### 🚀 Features:
+- 🏷️ **DocTags for Efficient Tokenization** – Introduces DocTags, an efficient and minimal document representation that is fully compatible with **DoclingDocuments**.
+- 🔍 **OCR (Optical Character Recognition)** – Extracts text accurately from images.
+- 📐 **Layout and Localization** – Preserves document structure and the **bounding boxes** of document elements.
+- 💻 **Code Recognition** – Detects and formats code blocks, including indentation.
+- 🔢 **Formula Recognition** – Identifies and processes mathematical expressions.
+- 📊 **Chart Recognition** – Extracts and interprets chart data.
+- 📑 **Table Recognition** – Supports column and row headers for structured table extraction.
+- 🖼️ **Figure Classification** – Differentiates figures and graphical elements.
+- 📝 **Caption Correspondence** – Links captions to relevant images and figures.
+- 📜 **List Grouping** – Organizes and structures list elements correctly.
+- 📄 **Full-Page Conversion** – Processes entire pages for comprehensive document transformation.
+- 📂 **General Document Processing** – Optimized for non-scientific documents.
+- 🔄 **Seamless Docling Integration** – Import into **Docling** and export in multiple formats.
+- 📚 **Multi-Page & Full Document Conversion** – *Coming soon!* 🚧
+**Repository:** [More Information Needed]
+**Paper [optional]:** [More Information Needed]
+**Demo [optional]:** [More Information Needed]
+## Model Summary
+- **Developed by:** Docling Team
+- **Model type:** Multimodal model (image+text)
+- **Language(s) (NLP):** English
+- **License:** Apache 2.0
+- **Finetuned from model:** Based on [Idefics3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) (see technical summary)
 ## Uses
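To make the Layout and Localization feature more concrete, here is a rough sketch of how bounding boxes might be recovered from DocTags-style output. This card does not specify the tag syntax, so everything about the format below is an assumption: that each element is wrapped in a named tag, that its box is written as four `<loc_N>` tokens in `x1, y1, x2, y2` order, and that coordinates live on a fixed grid (500 is used as a default guess). The sample string and the helper `extract_boxes` are hypothetical; for real conversions, prefer importing the output into Docling.

```python
import re

# ASSUMED element shape: <tag><loc_x1><loc_y1><loc_x2><loc_y2>body</tag>
LOC_ELEMENT = re.compile(
    r"<(?P<tag>\w+)>"
    r"<loc_(?P<x1>\d+)><loc_(?P<y1>\d+)><loc_(?P<x2>\d+)><loc_(?P<y2>\d+)>"
    r"(?P<body>.*?)"
    r"</(?P=tag)>",
    re.DOTALL,
)


def extract_boxes(doctags, page_width, page_height, grid=500):
    """Return (tag, box_in_pixels, text) tuples for each located element.

    `grid` is the assumed coordinate resolution of the location tokens;
    adjust it if the actual output uses a different scale.
    """
    elements = []
    for match in LOC_ELEMENT.finditer(doctags):
        x1, y1, x2, y2 = (int(match.group(k)) for k in ("x1", "y1", "x2", "y2"))
        box = (
            x1 * page_width / grid,
            y1 * page_height / grid,
            x2 * page_width / grid,
            y2 * page_height / grid,
        )
        elements.append((match.group("tag"), box, match.group("body").strip()))
    return elements


# Hypothetical sample, written by hand for illustration (not real model output).
SAMPLE = (
    "<section_header><loc_50><loc_25><loc_450><loc_50>Introduction</section_header>\n"
    "<text><loc_50><loc_60><loc_450><loc_100>Hello world.</text>"
)

elements = extract_boxes(SAMPLE, page_width=1000, page_height=2000)
for tag, box, text in elements:
    print(tag, box, text)
```

Scaling by the page size at the end keeps the parsed values independent of image resolution, which is the usual reason for emitting coordinates on a fixed grid.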