Christoph Auer

170 comments by Christoph Auer

@mattmalcher If you can provide us with failing tests, that would be very helpful for checking. Thanks.

@titouv we could handle such things downstream through new enrichment models, which work on the `DoclingDocument` once the native Word backend has produced it. So, we have the framework to...

@titouv would you be interested in implementing an enrichment model to perform OCR (which would then cover Word documents)? We have the necessary framework in place, an example for a...

Closing this since the thread has been inactive for a while. Please feel free to reopen if there is interest in it.

@Fogapod I agree we should safeguard the docling code against this edge case. Still, this appears to be a 1x1 pixel PNG. Do you see this problem also on a...

@Zhengyu-Ju I am closing this since we have no further feedback to iterate on. Please re-open if you have any further feedback.

@jbdebard The ref pulled by `docling` in the current version is `v2.0.1` for this reason. If you want to pull the right version of the HF docling-models for current docling,...

@jbdebard The safetensors model versions have been used in docling since 2.12.0. Closing this issue as completed.

@JabSYsEmb Thanks for reporting. We are aware that right-to-left (RTL) text is currently not handled by Docling, and it is on our roadmap to support it.

@Alla-Abdella Can you please re-check this after adding `options.force_full_page_ocr = True`? We need to be sure it is actually using OCR and not preferring content encoded in the PDF.
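
For readers following along, here is a minimal sketch of how the `force_full_page_ocr` setting might be applied. It assumes that `options` in the comment above refers to the OCR options attached to the PDF pipeline options, and that a recent docling 2.x release is installed. The helper name `build_full_page_ocr_converter` is hypothetical and only serves the illustration.

```python
def build_full_page_ocr_converter():
    """Return a DocumentConverter that OCRs every PDF page in full.

    Hypothetical helper for illustration. docling is imported lazily,
    so this module can be loaded even where docling is not installed.
    """
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    pipeline_options = PdfPipelineOptions()
    pipeline_options.do_ocr = True
    # Assumption: this is the `options` object the comment refers to.
    # Forcing full-page OCR makes the pipeline ignore text that is
    # already encoded in the PDF and rely on the OCR engine instead.
    pipeline_options.ocr_options.force_full_page_ocr = True

    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
```

Calling `build_full_page_ocr_converter().convert("sample.pdf")` (with a placeholder file name) and comparing the output with a default converter is one way to check whether the extracted text really comes from OCR.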