[Question]: images and tables
Hi, my questions may sound very specific, as I am working on a very interesting use case on top of ragflow. Is there a way to extract the images from a PDF so that when I ask in the chat to see the images, I get the images themselves? (Right now it only says that there are images in the PDF.) Even when I manually uploaded images to Files and linked them to the knowledge base, the chat still didn't recognize them. (It would also make more sense to be able to link uploaded files to a knowledge base in bulk rather than one by one.)
It would also be more powerful to have the tables in the document extracted as separate chunks.
Is my impression right that increasing the response length in the settings affects the response quality? If so, how can that be addressed when pulling answers through API calls?
Getting the images alone in chat is very interesting.
With the chat API, the response includes an image ID, which you can use to fetch the image with a 'GET' request to /v1/document/get/
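A rough sketch of that fetch in Python, for anyone calling the API from a script. It assumes the image ID from the chat response is appended directly to the path quoted above, and that the server accepts a token in an Authorization header; neither detail is confirmed in this thread. The names build_image_url and fetch_image are hypothetical helpers, not part of ragflow.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Path exactly as quoted in the reply above.
IMAGE_PATH = "/v1/document/get/"


def build_image_url(base_url: str, image_id: str) -> str:
    """Build the image URL (assumption: the image ID is appended to the path)."""
    return base_url.rstrip("/") + IMAGE_PATH + quote(image_id, safe="")


def fetch_image(base_url: str, image_id: str, token: str = "") -> bytes:
    """Send a GET request and return the raw image bytes.

    The Authorization header is an assumption; adjust it to however
    your deployment authenticates requests.
    """
    headers = {"Authorization": token} if token else {}
    request = Request(build_image_url(base_url, image_id), headers=headers, method="GET")
    with urlopen(request, timeout=30) as response:
        return response.read()
```

Calling fetch_image with your server's base URL and an image ID taken from a chat response should return bytes that can be written to a file, provided the two assumptions above hold for your deployment.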
Yeah I'm also working on something similar. I'm planning to extract an image from a pdf file in the knowledge base, and then hopefully feed that image to the llm for analysis (instead of using OCR cuz it's somewhat lossy for images, screenshots, etc.). Is that doable with ragflow? Especially the extracting image part.
The images have already been recognized and extracted. The chunk list of a doc shows them all.