Christoph Auer
One more check to consider: if the input format is an image, OCR _must_ be activated in its pipeline options, independent of the global OCR choice.
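To make the suggested check concrete, here is a minimal sketch. It does not use docling's actual classes: `PipelineOptions`, `IMAGE_FORMATS`, and `resolve_options` are hypothetical stand-ins, and the only assumption is that the options carry a boolean `do_ocr` flag.

```python
from dataclasses import dataclass, replace

# Hypothetical set of format names treated as images (not docling's enum).
IMAGE_FORMATS = {"image"}


@dataclass(frozen=True)
class PipelineOptions:
    """Hypothetical stand-in for per-format pipeline options."""
    do_ocr: bool = True


def resolve_options(input_format: str, options: PipelineOptions) -> PipelineOptions:
    """Return options with OCR forced on for image inputs.

    For every other format the caller's (global) OCR choice is kept as-is.
    """
    if input_format in IMAGE_FORMATS and not options.do_ocr:
        # Images carry no programmatic text, so OCR is the only text source.
        return replace(options, do_ocr=True)
    return options


global_choice = PipelineOptions(do_ocr=False)
image_options = resolve_options("image", global_choice)
pdf_options = resolve_options("pdf", global_choice)
```

With the global choice set to `do_ocr=False`, the image options come back with OCR enabled while the PDF options are left untouched. Returning a copy rather than mutating keeps the global options object unchanged for other formats.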
It appears this is solved with more recent transformers / torch versions. Please feel free to re-open if you still see the issue.
@vodkar Thanks for being open to contributing. We can discuss it here. Do you already have ideas on where you would start?
@poojitha0892 This is unfortunately a known issue with our layout model. We are working on addressing this. See also: https://github.com/DS4SD/docling/issues/308
Closing this since #308 will track this topic.
Meanwhile we have added the possibility to represent these styles in `DoclingDocument` if the input format contains that information. The serializers should respect it.
@Swaymaw Thanks for the configuration options enhancements, this matches what I had in mind. However, to better align with an in-development global configuration system in docling (see [here](https://github.com/DS4SD/docling/discussions/373)) without...
@kime541200 @ezscode We are actively working on supporting this case. A new release will bring this capability soon.
@sunil448832 we are not looking into in-process parallelization yet, because there is nothing to gain from it. This was what we took from early experiments. Everything computed in docling is...
@vitaly-d Since docling 2.17.0 we have code and equation transcription, which is limited to "display equations". We cannot yet detect inline equations that are part of paragraphs.