Is it possible for convert() to return generators?

Open ofermend opened this issue 8 months ago • 2 comments
Question

I have a fairly large PDF document that I'm trying to convert, and I often run out of memory (OOM). I could increase the available memory, but that doesn't scale well. Is it possible to return the text, images, and tables in the Document as generators, so that processing can be done on demand instead of holding everything in memory?

ofermend avatar Feb 25 '25 03:02 ofermend

At the moment, convert() returns a generator over the documents passed in the input argument, but not over the contents within each document.
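
To illustrate the distinction, here is a minimal sketch in plain Python. It does not use docling's API: `ParsedDocument`, `parse`, and `convert_lazily` are hypothetical names made up for the example. It shows laziness across documents only, with each document still built fully in memory once it is requested.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class ParsedDocument:
    """Hypothetical stand-in for a fully converted document."""
    name: str
    texts: List[str] = field(default_factory=list)


def parse(name: str) -> ParsedDocument:
    # Hypothetical parser: builds the whole document in memory at once.
    return ParsedDocument(name=name, texts=[f"{name}: paragraph {i}" for i in range(3)])


def convert_lazily(sources: Iterable[str]) -> Iterator[ParsedDocument]:
    # Lazy across documents: a source is parsed only when the caller asks for it.
    for source in sources:
        yield parse(source)


results = convert_lazily(["a.pdf", "b.pdf"])
first = next(results)  # only "a.pdf" has been parsed at this point
# first.texts is an ordinary list, so this one document is held in memory in full.
```

With this shape, splitting a large input into several smaller files bounds memory per document, but a single large file is still materialized all at once.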

dolfim-ibm avatar Feb 25 '25 07:02 dolfim-ibm

It would be great to add this within a document for text, images, and tables. Thank you!

ofermend avatar Feb 25 '25 13:02 ofermend