Update README.md
README.md
CHANGED
@@ -148,8 +148,8 @@ print(f"Total time: {time.time() - start_time:.2f} sec")
 ## DocTags
 
 <img src="https://huggingface.co/ds4sd/SmolDocling-256M-preview/resolve/main/assets/doctags_v2.png" width="800" height="auto" alt="Image description">
-
-
+DocTags define a structured vocabulary of tags and rules that separates a document's text from its structure. This reduces ambiguity for Image-to-Sequence models. In contrast, converting directly to formats like HTML or Markdown is lossy: it often drops details, does not clearly represent the document's layout, and increases the token count, which makes processing less efficient.
+DocTags are integrated with Docling, which allows export to HTML, Markdown, and JSON. These exports can be offloaded to the CPU, reducing token generation overhead and improving efficiency.
 
 ## Supported Instructions
 | Instruction | Description |
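
The idea in the added paragraphs, that structure is kept apart from text and the final format is produced on the CPU, can be illustrated with a small sketch. It does not use Docling or the real DocTags grammar: the three tag names, the `TAG_TO_MARKDOWN` mapping, and both helper functions are hypothetical and exist only for this example (real DocTags carry more element types and layout information than shown here).

```python
import re

# Hypothetical, simplified tag set for illustration only; this is NOT the
# real DocTags vocabulary and it ignores layout/location information.
TAG_TO_MARKDOWN = {
    "title": "# {}",
    "section_header": "## {}",
    "text": "{}",
}

# Matches <tag>body</tag> pairs; the backreference forces matching close tags.
ELEMENT_RE = re.compile(r"<(?P<tag>\w+)>(?P<body>.*?)</(?P=tag)>", re.DOTALL)


def parse_elements(markup):
    """Split tagged markup into (structure_tag, text) pairs."""
    return [
        (m.group("tag"), m.group("body").strip())
        for m in ELEMENT_RE.finditer(markup)
    ]


def to_markdown(elements):
    """Render parsed elements as Markdown; unknown tags fall back to plain text."""
    lines = [TAG_TO_MARKDOWN.get(tag, "{}").format(text) for tag, text in elements]
    return "\n\n".join(lines)


sample = (
    "<title>Quarterly Report</title>"
    "<section_header>Summary</section_header>"
    "<text>Revenue grew.</text>"
)
elements = parse_elements(sample)
markdown = to_markdown(elements)
print(markdown)
```

Because the model only has to emit the compact tagged sequence, the conversion to Markdown (or another target format) is ordinary string processing that runs after generation, which is the kind of work the second added paragraph describes as offloaded to the CPU.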