
Option to disable image extraction/merging

Open jerbob92 opened this issue 1 year ago • 4 comments

I've been playing around with the library and getting some pretty nice results with it! However, in my tests it often classifies large regions of a page as an image and then doesn't return the text and/or tables inside that region (probably because it merges them into the largest layout item it detects).

I think it would be nice to have an option to disable the image extraction/merging.
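
To illustrate the behavior I suspect, here is a toy sketch. It is not docling's actual code: the `Box` class, the `resolve_layout` function, and the `merge_into_pictures` flag are all hypothetical names made up for this example. It assumes that any layout item fully contained in a detected picture gets absorbed by it, and shows what a switch to turn that off could look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A detected layout item with an axis-aligned bounding box (hypothetical)."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Box") -> bool:
        # True if `other` lies entirely inside this box.
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)


def resolve_layout(boxes, merge_into_pictures=True):
    """Return the layout items that survive post-processing.

    With merge_into_pictures=True, items fully inside a picture are dropped
    (the suspected current behavior). With False, every item is kept
    (the requested option). Both the function and the flag are assumptions.
    """
    if not merge_into_pictures:
        return list(boxes)
    pictures = [b for b in boxes if b.label == "picture"]
    kept = []
    for b in boxes:
        swallowed = any(p is not b and p.contains(b) for p in pictures)
        if not swallowed:
            kept.append(b)
    return kept


boxes = [
    Box("picture", 0, 0, 100, 100),   # large region detected as an image
    Box("text", 10, 10, 90, 20),      # text inside the picture
    Box("table", 10, 30, 90, 80),     # table inside the picture
    Box("text", 0, 110, 100, 120),    # text outside the picture
]

merged = resolve_layout(boxes)
unmerged = resolve_layout(boxes, merge_into_pictures=False)
```

With merging on, only the picture and the text outside it remain; with merging off, the text and table inside the picture are kept as separate items, which is what I'd like to be able to get.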

jerbob92 avatar Nov 05 '24 08:11 jerbob92

@cau-git Hello, can you please assign this to me?

BelaidCH avatar Dec 06 '24 15:12 BelaidCH

@jerbob92 @BelaidCH We have a version in the works that will make it possible to get the in-picture content out. It will be released by the end of next week. I will post instructions on how to test it if you want to help with that.

cau-git avatar Dec 07 '24 09:12 cau-git

Sure! Would be great to test it out!

jerbob92 avatar Dec 09 '24 19:12 jerbob92

Yes, it would be great!

BelaidCH avatar Dec 10 '24 10:12 BelaidCH

This has since been implemented and is ready to use.

cau-git avatar May 23 '25 08:05 cau-git