asnassar committed on
Commit
3d3bb6a
·
verified ·
1 Parent(s): bd6e6ad

Update README.md

Files changed (1)
  1. README.md +36 -18
README.md CHANGED
@@ -1,37 +1,55 @@
1
  ---
2
  library_name: transformers
3
- tags: []
4
  ---
5
 
6
- # Model Card for Model ID
7
-
8
- <!-- Provide a quick summary of what the model is/does. -->
9
 
 
10
 
11
 
12
  ## Model Details
13
 
14
  ### Model Description
15
 
16
- <!-- Provide a longer summary of what this model is. -->
17
 
18
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
 
19
 
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
 
28
- ### Model Sources [optional]
29
 
30
- <!-- Provide the basic links for the model. -->
31
 
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
 
36
  ## Uses
37
 
 
1
  ---
2
  library_name: transformers
3
+ license: apache-2.0
4
+ language:
5
+ - en
6
+ base_model:
7
+ - HuggingFaceTB/SmolVLM-256M-Instruct
8
+ pipeline_tag: image-text-to-text
9
  ---
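
The metadata block added on the new side of this diff (license, language, base_model, pipeline_tag) is plain YAML front matter. As a rough illustration of what those fields contain once read, here is a minimal sketch that parses the block with the standard library only. The helper name `parse_front_matter` and the constant `FRONT_MATTER` are hypothetical, the parser handles only flat scalars and simple `- item` lists, and a real project would more likely use a proper YAML library.

```python
# New-side front matter from this commit, with the diff "+" markers removed.
FRONT_MATTER = """\
---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- HuggingFaceTB/SmolVLM-256M-Instruct
pipeline_tag: image-text-to-text
---
"""


def parse_front_matter(text):
    """Parse a flat front matter block (hypothetical helper, not a full YAML parser).

    Supports 'key: value' scalars and 'key:' followed by '- item' list entries.
    """
    lines = text.strip().splitlines()
    if len(lines) < 2 or lines[0].strip() != "---":
        raise ValueError("front matter must start with '---'")

    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter: the block is complete.
            return meta
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list entry belongs to the most recent key that had no scalar value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError(f"list item without a list key: {line!r}")
            meta[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {line!r}")
        current_key = key.strip()
        value = value.strip()
        # A key with an empty value starts a list; otherwise store the scalar.
        meta[current_key] = value if value else []
    raise ValueError("front matter is not closed by '---'")


card_meta = parse_front_matter(FRONT_MATTER)
print(card_meta["pipeline_tag"])
print(card_meta["base_model"])
```

Keeping `language` and `base_model` as lists mirrors how they are written in the card, where each may hold more than one entry.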
10
 
11
+ ### SmolDocling-256M-preview
 
 
12
 
13
+ SmolDocling is a multimodal Image-Text-to-Text model that features
14
 
15
 
16
  ## Model Details
17
 
18
  ### Model Description
19
 
20
+ ### SmolDocling-256M-preview
21
+
22
+ SmolDocling is a multimodal Image-Text-to-Text model designed for efficient document conversion. It retains Docling's most popular features while ensuring full compatibility with Docling through seamless support for **DoclingDocuments**.
23
+
24
+ ### 🚀 Features:
25
+ - 🏷️ **DocTags for Efficient Tokenization** – Introduces DocTags, an efficient and minimal representation for documents that is fully compatible with **DoclingDocuments**.
26
+ - 🔍 **OCR (Optical Character Recognition)** – Extracts text accurately from images.
27
+ - 📐 **Layout and Localization** – Preserves document structure and the **bounding boxes** of document elements.
28
+ - 💻 **Code Recognition** – Detects and formats code blocks, including indentation.
29
+ - 🔢 **Formula Recognition** – Identifies and processes mathematical expressions.
30
+ - 📊 **Chart Recognition** – Extracts and interprets chart data.
31
+ - 📑 **Table Recognition** – Supports column and row headers for structured table extraction.
32
+ - 🖼️ **Figure Classification** – Differentiates figures and graphical elements.
33
+ - 📝 **Caption Correspondence** – Links captions to relevant images and figures.
34
+ - 📜 **List Grouping** – Organizes and structures list elements correctly.
35
+ - 📄 **Full-Page Conversion** – Processes entire pages for comprehensive document transformation.
36
+ - 📂 **General Document Processing** – Optimized for non-scientific documents.
37
+ - 🔄 **Seamless Docling Integration** – Import into **Docling** and export in multiple formats.
38
+ - 📚 **Multi-Page & Full Document Conversion** – *Coming soon!* 🚧
39
 
40
+ **Repository:** [More Information Needed]
41
+ **Paper [optional]:** [More Information Needed]
42
+ **Demo [optional]:** [More Information Needed]
43
 
44
 
45
+ ## Model Summary
46
 
47
+ - **Developed by:** Docling Team
48
+ - **Model type:** Multi-modal model (image+text)
49
+ - **Language(s) (NLP):** English
50
+ - **License:** Apache 2.0
51
+ - **Finetuned from model:** Based on [Idefics3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) (see technical summary)
52
 
53
 
54
  ## Uses
55