[Feature Request]: Use text context for image enhancement
Self Checks
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (Language Policy).
- [x] Non-English title submissions will be closed directly (Language Policy).
- [x] Please do not modify this template :) and fill in all the required fields.
Is your feature request related to a problem?
Currently, image enhancement with the image2text model uses only the image itself to create a text chunk.
Describe the feature you'd like
Please add an option to include the surrounding text context. This would give the image2text VLM a better chance of describing a given image. For example, in a PDF document where a paragraph and a figure relate to each other, it would be useful to include the related paragraph(s) when describing the image, rather than relying on the image alone.
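For illustration, here is a minimal sketch of what such an option could look like, assuming an OpenAI-compatible vision API; the `describe_image` helper, model name, and prompt wording are all hypothetical and not taken from RAGFlow:

```python
# Hypothetical sketch (not RAGFlow's actual API): pass the paragraphs that
# surround a figure to the VLM alongside the image, so the generated
# description can use the document context.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def describe_image(image_path: str, surrounding_text: str = "") -> str:
    """Ask a vision model to describe an image, optionally with the
    paragraphs that appear around it in the source document."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    prompt = "Describe this figure for retrieval purposes."
    if surrounding_text:
        prompt += (
            "\nUse the surrounding document text below to resolve "
            "references and terminology:\n" + surrounding_text
        )
    resp = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content
```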
Describe implementation you've considered
No response
Documentation, adoption, use case
No response
Additional information
No response
Hi @ahmedavid , really appreciate your input! Could you share any bottlenecks you've hit with the current implementation?
Hello, I am asking whether it is possible to make image enhancement take the surrounding text context into account. It is not a bug report; I am requesting a new feature.
Thanks.
Got it. Will do some investigation.
We've implemented a change that adds the surrounding text context to image and table chunks: https://github.com/infiniflow/ragflow/pull/11547. This means that even if the image's own description is not entirely accurate, the image can still be retrieved by matching the surrounding context stored inside the chunk. Does this meet your needs?
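For readers skimming the thread, here is a rough sketch of the idea behind the change; the function and field names are hypothetical and not taken from the PR:

```python
# Illustration only (not the PR's actual code): store the neighboring
# paragraphs inside the image chunk, so keyword or embedding matches on
# the context still retrieve the image even when its generated
# description is imperfect.
def build_image_chunk(description: str, before: str, after: str) -> dict:
    context = " ".join(t for t in (before, after) if t)
    return {
        "content": f"{context}\n{description}" if context else description,
        "type": "image",
    }

chunk = build_image_chunk(
    description="Bar chart of quarterly revenue.",
    before="Figure 3 summarizes revenue growth across 2023.",
    after="As shown above, Q4 outperformed earlier quarters.",
)
# The chunk's content carries both the description and the context, so a
# query like "revenue growth 2023" can match this image chunk.
print(chunk["content"])
```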