
page image preprocessing steps, not able to reproduce results

Open mllife opened this issue 1 year ago • 1 comments

I am trying out your layout model "model_artifacts/layout/beehive_v0.0.5_pt/model.pt", but I am seeing completely different output. I see that you rescale the page from 1.5x resolution back to the original w, h in the "get_page_image" method in pdf_backend, but I am still not getting the same output. Can you tell me what else I am missing?

mllife avatar Nov 26 '24 10:11 mllife

@mllife Are you using the test code provided with docling-ibm-models? It should demonstrate how to use it.

cau-git avatar Nov 26 '24 13:11 cau-git

@cau-git, yes, I ran a page image through the test code, and then ran the same page through the docling pipeline as a PDF, and I am seeing different results. The docling pipeline gives perfect output, but the test code does not. I looked into the code and saw that it uses image rescaling, but even with that I am still not getting the same output. Am I missing something?

mllife avatar Nov 27 '24 04:11 mllife
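
For anyone comparing the two paths, here is a minimal sketch of the size arithmetic described in this thread. It assumes the page is rendered at 1.5x and then resized back to the original width and height. The function names are hypothetical and are not part of docling or docling-ibm-models. The sketch only checks dimensions; interpolation, color mode, or other preprocessing could also differ and are not covered here.

```python
from typing import Tuple

Size = Tuple[int, int]


def pipeline_image_sizes(page_w: float, page_h: float,
                         render_scale: float = 1.5) -> Tuple[Size, Size]:
    """Return (render_size, final_size) in pixels.

    Assumption taken from the thread: the page is rendered at
    `render_scale` and then resized back to the original w, h.
    """
    render_size = (round(page_w * render_scale), round(page_h * render_scale))
    final_size = (round(page_w), round(page_h))
    return render_size, final_size


def size_matches(image_size: Size, page_w: float, page_h: float) -> bool:
    """Check whether a standalone image has the size the pipeline would produce."""
    _, final_size = pipeline_image_sizes(page_w, page_h)
    return tuple(image_size) == final_size


# Hypothetical example: a US Letter page measured in points.
render_size, final_size = pipeline_image_sizes(612, 792)
print(render_size, final_size)
```

If `size_matches` returns `False` for the image passed to the test code, the input size is one candidate explanation for the differing output, though not necessarily the only one.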