docling
Markdown header levels are all the same
Bug
All headers in the generated markdown files are at the same level, i.e. ## <Header>.
For example, in the sample PDF to markdown conversion, section 3 has multiple sub-sections:
## 3 Processing pipeline
....
## 3.1 PDF backends
...
## 3.2 AI models
The correct header levels really should be something like this:
## 3 Processing pipeline
....
### 3.1 PDF backends
...
### 3.2 AI models
...
It would be great if the header levels could be fixed, since that would preserve the hierarchical structure of the original article.
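Until this is fixed upstream, one possible workaround is to post-process the exported Markdown and infer each header's level from its section number. The sketch below is not part of docling: `fix_heading_levels` and its `base_level` parameter are hypothetical names. It assumes headers start with a dotted number such as `3.1`, and it leaves unnumbered headers and anything inside fenced code blocks untouched.

```python
import re

# An ATX heading whose text starts with a dotted section number,
# e.g. "## 3.1 PDF backends": group(1) is the full heading text, group(2) is "3.1".
_NUMBERED_HEADING = re.compile(r"^#{1,6}\s+((\d+(?:\.\d+)*)\s+.*)$")


def fix_heading_levels(markdown: str, base_level: int = 2) -> str:
    """Re-level numbered headings: "3" -> base_level, "3.1" -> base_level + 1, ...

    Hypothetical helper, not a docling API. Levels are capped at 6,
    the deepest heading level Markdown supports.
    """
    out = []
    in_fence = False
    for line in markdown.splitlines():
        # Track fenced code blocks so '#' comments inside them are not rewritten.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else _NUMBERED_HEADING.match(line)
        if match:
            text, number = match.group(1), match.group(2)
            depth = number.count(".")  # "3" -> 0, "3.1" -> 1, "3.1.2" -> 2
            level = min(base_level + depth, 6)
            line = "#" * level + " " + text
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "## 3 Processing pipeline\n\n## 3.1 PDF backends\n\n## 3.2 AI models"
    print(fix_heading_levels(sample))
```

This only works for documents that number their sections. It will also misread a header that merely begins with a number (a year, for example) as a top-level section, so it is a stopgap rather than a substitute for proper level detection in the converter.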
Steps to reproduce
Just run the sample code of this repo. ...
Docling version
...
Python version
...
Yes, it would be great to have proper hierarchical headers!
dupe of #287?