ragflow
[Question]: Multimodal embedding retrieval for images in docs
Describe your problem
Is there any support for image retrieval in doc/xls files, etc.? Specifically:
- Extract images from xls files automatically.
- Encode each image as an embedding and retrieve it with a multimodal embedding model, such as LLM2CLIP (https://arxiv.org/abs/2411.04997) or GME (https://arxiv.org/abs/2412.16855).
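For the first point, a minimal sketch of what automatic extraction could look like, assuming the file is in the newer .xlsx format (a zip archive that stores embedded pictures under `xl/media/`). The legacy binary .xls format is not a zip archive and would need a different approach. The function name `extract_xlsx_images` and the extension list are my own and are not part of ragflow.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions treated as images here; this list is an assumption and may be incomplete.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".emf", ".wmf"}


def extract_xlsx_images(path_or_file):
    """Return {archive_member_name: raw_bytes} for images embedded in an .xlsx file.

    Hypothetical helper for illustration; accepts a filesystem path or a
    binary file-like object, as zipfile.ZipFile does.
    """
    images = {}
    with zipfile.ZipFile(path_or_file) as zf:
        for name in zf.namelist():
            suffix = PurePosixPath(name).suffix.lower()
            # Embedded pictures in an .xlsx package live under xl/media/.
            if name.startswith("xl/media/") and suffix in IMAGE_EXTS:
                images[name] = zf.read(name)
    return images
```

The returned bytes could then be passed to whatever image encoder is used for the second point.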
Currently, as far as I can tell, images are processed with an img2text API: an image description is extracted and then retrieved like any other text. This is not real text-image retrieval via a multimodal embedding model.
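To make the request concrete, here is a rough sketch of the retrieval step I have in mind: the text query and the images are embedded into the same vector space, and images are ranked by cosine similarity to the query. The vectors below are made-up placeholders. In practice they would come from a multimodal model such as LLM2CLIP or GME, whose APIs are not shown here. The names `cosine` and `top_k` are hypothetical and not taken from ragflow.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, index, k=3):
    """Return the k (item_id, score) pairs most similar to query_vec, best first."""
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Placeholder image embeddings; a real system would compute these with the
# image encoder of a multimodal model.
image_index = {
    "sheet1/chart.png": [0.9, 0.1, 0.0],
    "sheet2/photo.jpg": [0.1, 0.8, 0.3],
}

# Placeholder query embedding; a real system would compute this with the
# text encoder of the same model, so both live in one vector space.
query_vec = [1.0, 0.0, 0.0]

best_id, best_score = top_k(query_vec, image_index, k=1)[0]
print(best_id, round(best_score, 3))
```

The difference from the img2text approach is that no intermediate caption is generated: the image itself is compared to the query in embedding space.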