Markdown with embedded coordinates
Question
We want to use docling to extract text from PDFs and feed that text to an LLM in order to extract certain information. We already have such a system in place; it uses the export_to_markdown method provided by docling to preserve structure.
Our next-phase requirement is to get coordinates along with the structured text (something like markdown with embedded coordinates) and feed that to the LLM, so that the LLM can return coordinates for each piece of extracted information.
Is there a way to accomplish this using only the docling library and its config options?
Hi @brownsloth! I'm Dosu and I’m helping the docling team.
Docling does not support exporting markdown with embedded coordinates for text elements via config options or built-in methods. The export_to_markdown method preserves structure but drops positional metadata such as bounding boxes and page coordinates. That information is only available in the native DoclingDocument format, not in markdown exports.
If you need coordinates with your text, you can iterate over the TextItem objects in the DoclingDocument, extract both text and coordinates programmatically, and then post-process them into a custom markdown or other format for your LLM. There is no config switch or export mode that adds coordinates to the markdown output directly.
If you want a code outline for extracting text and coordinates, let me know!
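For reference, here is a minimal sketch of such an outline. It is not a built-in docling export. It assumes the DoclingDocument exposes iterate_items() yielding (item, level) pairs, and that text-bearing items carry .text plus a .prov list whose entries have .page_no and a .bbox with l, t, r, b fields. Span, spans_from_docling and annotate are hypothetical helper names, and the HTML-comment annotation format is just one possible convention.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class Span:
    """One piece of text with its location (hypothetical helper type)."""
    text: str
    page: int
    bbox: Tuple[float, float, float, float]  # (l, t, r, b)


def spans_from_docling(doc) -> Iterator[Span]:
    """Yield one Span per provenance entry of every item that has text.

    Assumption: `doc` behaves like a DoclingDocument, i.e. iterate_items()
    yields (item, level) and items expose `.text` and `.prov`.
    """
    for item, _level in doc.iterate_items():
        text = getattr(item, "text", None)
        if not text:
            continue  # skip items without text (e.g. pictures)
        for prov in getattr(item, "prov", None) or []:
            b = prov.bbox
            yield Span(text, prov.page_no, (b.l, b.t, b.r, b.b))


def annotate(spans: Iterable[Span]) -> str:
    """Render spans as paragraphs, each followed by an HTML comment
    carrying page number and bounding box (rounded to one decimal)."""
    blocks = []
    for s in spans:
        l, t, r, b = (round(v, 1) for v in s.bbox)
        blocks.append(f"{s.text} <!-- page={s.page} bbox=[{l}, {t}, {r}, {b}] -->")
    return "\n\n".join(blocks)


# Usage (requires docling to be installed; "file.pdf" is a placeholder):
#   from docling.document_converter import DocumentConverter
#   doc = DocumentConverter().convert("file.pdf").document
#   print(annotate(spans_from_docling(doc)))
```

Note that this flattens headings, lists and tables into plain paragraphs, so markdown structure would have to be re-added per item type. The bounding boxes may also use a bottom-left origin for PDFs, so check the coordinate origin before comparing values across tools.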
@brownsloth I previously managed to map sentence-level provenance in https://github.com/google/langextract/issues/184#issue-3349438594 (check pdf_extract.py).