[Question] Figures and captions
Question
Hi, is this library also able to use multi-modal LLMs to interpret charts and figures within PDF documents?
Hello @austinmw,
Thank you for your question. Currently, Docling does not offer this feature. However, our team is actively working on introducing an image classifier model first, and then a multi-modal LLM that can convert charts into structured formats like JSON, CSV, and Markdown. Stay tuned!
@Matteo-Omenetti keep us updated on this!
I'm parsing some documents months after this discussion. For some reason, my Docling document identifies many captions and labels them as captions, but it doesn't associate them with the images directly above them. Is anyone else having this issue?
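As a possible stopgap for the missing links, captions could be paired with pictures in a post-processing step based on their positions on the page. The sketch below is not Docling's API: `Box`, `horizontal_overlap`, `match_captions`, and the `max_gap` threshold are all hypothetical names, and it assumes you have already extracted one bounding box per picture and per caption, in top-left-origin coordinates (y grows downward). If your boxes use a bottom-left origin, convert them first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box; top-left origin, y grows downward."""
    page: int
    left: float
    top: float
    right: float
    bottom: float


def horizontal_overlap(a: Box, b: Box) -> float:
    """Width of the shared horizontal extent of two boxes (0 if none)."""
    return max(0.0, min(a.right, b.right) - max(a.left, b.left))


def match_captions(
    pictures: List[Box],
    captions: List[Box],
    max_gap: float = 50.0,  # assumed threshold, in page units
) -> Dict[int, Optional[int]]:
    """Map each caption index to the index of the closest picture
    directly above it on the same page, or None if no picture qualifies."""
    matches: Dict[int, Optional[int]] = {}
    for ci, cap in enumerate(captions):
        best: Optional[int] = None
        best_gap = max_gap
        for pi, pic in enumerate(pictures):
            if pic.page != cap.page:
                continue  # only pair items on the same page
            if horizontal_overlap(pic, cap) <= 0:
                continue  # picture is not in the caption's column
            gap = cap.top - pic.bottom  # vertical distance picture -> caption
            if 0 <= gap <= best_gap:
                best, best_gap = pi, gap
        matches[ci] = best
    return matches
```

Captions that come back as `None` are the ones worth inspecting by hand. The heuristic only looks for pictures above a caption, so documents that place captions above their figures would need the gap test reversed.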