[Question] Figures and captions

Open austinmw opened this issue 1 year ago • 2 comments

Question

Hi, is this library also able to use multi-modal LLMs to interpret charts and figures within PDF documents?

austinmw Dec 03 '24 15:12

Hello @austinmw,

Thank you for your question. Currently, Docling does not offer this feature. However, our team is actively working on introducing an image classifier model first, and then a multi-modal LLM that can convert charts into structured formats like JSON, CSV, and Markdown. Stay tuned!

Matteo-Omenetti Dec 09 '24 09:12

@Matteo-Omenetti keep us updated on this!

simjak Dec 09 '24 17:12

I'm parsing some documents months after this discussion. For some reason, my Docling document identifies many captions and labels them as captions, but it doesn't associate them with the images directly above them. Is anyone else having this issue?

rafaelghiorzi Mar 19 '25 14:03