Christoph Auer
This has been addressed in the meantime. Please re-open if you find it is still broken.
@sallahbaksh I tried converting the sample document, and it does indeed take a long time. However, it comes out fine in the end. I suspect the reason is that the high...
Potential duplicate of #671
@xfab-hari Docling offers multiple choices for OCR, among them EasyOCR, RapidOCR, Tesseract, and ocrmac. The results of the docling conversion will depend on how well these engines handle handwriting recognition.
More precisely, the DPI value is hardcoded individually per OCR backend, since every OCR backend has its own expectations regarding resolution.
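To illustrate what a per-backend DPI setting means in practice, here is a minimal sketch, not docling's actual code. The table name, the helper functions, and all DPI numbers are hypothetical placeholders; the only fixed fact is that PDF page coordinates use 72 points per inch, so the render scale is `dpi / 72`.

```python
# Minimal sketch: per-backend DPI lookup and the resulting render scale.
# All DPI values below are hypothetical placeholders, NOT docling's real values.

PDF_POINTS_PER_INCH = 72  # PDF user space defines 72 points per inch

# Hypothetical table: each OCR backend gets its own fixed DPI.
HYPOTHETICAL_BACKEND_DPI = {
    "easyocr": 216,
    "rapidocr": 216,
    "tesseract": 288,
    "ocrmac": 144,
}


def render_scale(backend: str) -> float:
    """Return the factor by which a PDF page is scaled before OCR."""
    try:
        dpi = HYPOTHETICAL_BACKEND_DPI[backend]
    except KeyError:
        raise ValueError(f"unknown OCR backend: {backend!r}") from None
    return dpi / PDF_POINTS_PER_INCH


def page_size_px(backend: str, width_pt: float, height_pt: float) -> tuple[int, int]:
    """Return the rendered page size in pixels for the given backend."""
    scale = render_scale(backend)
    return round(width_pt * scale), round(height_pt * scale)
```

With these placeholder values, a US Letter page (612 x 792 points) would be rendered at a different pixel size for each backend, which is why a single global DPI setting would not suit all of them.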
@harinisri2001 Closing this issue, as it appears there is no further follow-up required. Please re-open if you have further input.
All mended now.
@kurekj Could you please check this again with docling 2.14.0 and report whether you still see it? A lot has changed in the layout processing since then.
@janschachtschabel Can you confirm whether this problem still occurs with docling==2.17.0 on Windows? That would help us understand it.
@MatheusAbdias Thanks for your suggestion. This comes back to https://github.com/DS4SD/docling/issues/802, where we track this issue more broadly. We want to avoid working with `libmagic` since it is GPL-licensed, hence...