ragflow
ragflow copied to clipboard
[Feature Request]: Can we use a VLM to do document parser?
Is there an existing issue for the same feature request?
- [x] I have checked the existing issues.
Is your feature request related to a problem?
Unable to accurately parse charts or other documents.
Describe the feature you'd like
Can we use a multimodal large model, such as Qwen2.5-VL, to extract content from images, scanned PDFs, or charts embedded in DOC files? If there is an interface that can be configured, it would be very flexible.
Describe implementation you've considered
No response
Documentation, adoption, use case
Additional information
No response