Christoph Auer
This should be working as suggested by @werruww. Please reopen if you still see issues.
@Arslan-Mehmood1 Some PDFs simply have garbled text layers like these, with no way to recover them. Some strategies that could help: 1. Check what you get when using our docling-parse-v2 or our pypdfium...
@divekarsc Could you please outline what code you execute and how you measure isolated timings for TableFormer?
@sdspieg Great to see your investigation with docling and GROBID. Let me answer a few of your points first: > 1. Is My Setup Fair and Correct? I was...
@itsainii That looks interesting! Could you please make a pull request with the code changes? Many thanks.
@jonaskahn I re-checked this, and I can see that many of the predicted text cells in EasyOCR come out with very low confidence. Can you please give a minimal code...
Closing this because of inactivity. Please feel free to reopen if there is further demand.
@Ouassim-Hamdani thanks for reporting and checking the copilot PR. I also don't believe it solved the problem and will look into it myself now.
@Ouassim-Hamdani I think there might be a different problem here. I was checking with a large document (https://api.printnode.com/static/test/pdf/a4_500_pages.pdf) and I get the correct pages out (range 30,35), however in a...
@Ouassim-Hamdani the actual fix this needs is here: https://github.com/docling-project/docling-ibm-models/pull/141 It is hard to understand how this bug did not show effects anywhere earlier...