[Feature Request]: Use text context for image enhancement

Open ahmedavid opened this issue 5 months ago • 4 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

Is your feature request related to a problem?

Currently, image enhancement with the image2text model uses only the image itself to create a text chunk.

Describe the feature you'd like

Please add an option to include the surrounding text context. This would give the image2text VLM a better chance of describing a given image accurately. For example, say I have a PDF document in which paragraphs and figure images relate to each other. It would be useful to pass the related paragraph(s) to the model when describing the image, rather than the image alone.
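To make the request concrete, here is a minimal sketch of how surrounding paragraphs could be folded into the prompt sent to the image2text model. It is an illustration only, not RAGFlow code: `Block`, `surrounding_text`, `build_image_prompt`, the `window` parameter, and the prompt wording are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One parsed document element (hypothetical structure for this sketch)."""
    kind: str      # "text" or "image"
    content: str   # paragraph text, or an image identifier for images


def surrounding_text(blocks: List[Block], index: int, window: int = 1) -> str:
    """Collect up to `window` text blocks before and after blocks[index]."""
    if window <= 0:
        return ""
    before = [b.content for b in blocks[:index] if b.kind == "text"][-window:]
    after = [b.content for b in blocks[index + 1:] if b.kind == "text"][:window]
    return "\n".join(before + after)


def build_image_prompt(blocks: List[Block], index: int, window: int = 1) -> str:
    """Build the text prompt that would accompany the image at blocks[index]."""
    base = "Describe the image in detail."
    context = surrounding_text(blocks, index, window)
    if not context:
        # No context requested or available: fall back to the image-only prompt.
        return base
    return f"{base}\nUse the surrounding document text as context:\n{context}"


blocks = [
    Block("text", "Figure 1 shows the pump layout."),
    Block("image", "fig1.png"),
    Block("text", "The inlet is on the left."),
]
prompt = build_image_prompt(blocks, index=1)
```

With `window=0` the function returns the plain image-only prompt, which would match the current behaviour described above.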

Describe implementation you've considered

No response

Documentation, adoption, use case


Additional information

No response

ahmedavid, Jul 29 '25 09:07

Hi @ahmedavid, really appreciate your input! Could you share any bottlenecks you've run into with the current implementation?

ZhenhangTung, Jul 29 '25 12:07

> Hi @ahmedavid, really appreciate your input! Could you share any bottlenecks you've run into with the current implementation?

Hello, I am asking whether image enhancement could take the surrounding text context into account. This is not a bug; I am asking for a new feature.

Thanks.

ahmedavid, Aug 02 '25 09:08

Got it. We will do some investigation.

ZhenhangTung, Aug 05 '25 09:08

We've implemented a change that adds surrounding text context to image and table chunks (https://github.com/infiniflow/ragflow/pull/11547). This means that even if the image's own description is not entirely accurate, the image can still be retrieved by matching the surrounding context stored inside the chunk. Does this meet your needs?
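As a rough illustration of the idea (not the code in the linked PR), the sketch below attaches neighbouring text to image and table chunks before indexing. The chunk dictionaries, the `attach_context` function, and the `context_chars` parameter are assumptions made up for this example.

```python
from typing import Dict, List


def _tail(text: str, n: int) -> str:
    """Last n characters of text, or an empty string when n <= 0."""
    return text[-n:] if n > 0 else ""


def _head(text: str, n: int) -> str:
    """First n characters of text, or an empty string when n <= 0."""
    return text[:n] if n > 0 else ""


def attach_context(chunks: List[Dict[str, str]], context_chars: int = 200) -> List[Dict[str, str]]:
    """Return copies of chunks where image/table chunks also carry nearby text."""
    result = []
    for i, chunk in enumerate(chunks):
        new_chunk = dict(chunk)  # copy so the input list is left unchanged
        if chunk["type"] in ("image", "table"):
            # Nearest text chunk before and after this one, if any.
            prev_text = next(
                (c["text"] for c in reversed(chunks[:i]) if c["type"] == "text"), ""
            )
            next_text = next(
                (c["text"] for c in chunks[i + 1:] if c["type"] == "text"), ""
            )
            parts = [
                _tail(prev_text, context_chars),
                chunk["text"],  # the image or table description itself
                _head(next_text, context_chars),
            ]
            new_chunk["text"] = "\n".join(p for p in parts if p)
        result.append(new_chunk)
    return result


chunks = [
    {"type": "text", "text": "Quarterly revenue grew 12%."},
    {"type": "image", "text": "A bar chart."},
    {"type": "text", "text": "Costs stayed flat."},
]
enriched = attach_context(chunks)
```

In this sketch a query about revenue can match the image chunk through the attached paragraph even though the description "A bar chart." never mentions revenue.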

redredrrred, Nov 27 '25 08:11