kotaemon
[REQUEST] Add image captioning
Reference Issues
No response
Summary
When a document is decomposed and images are extracted, we could also generate a caption for each image and store the caption text in the document. This should provide better results.
Basic Example
We could support BLIP and the OpenAI Vision API initially.
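A rough sketch of how pluggable captioning backends might look. Everything here is hypothetical (`Captioner`, `StubCaptioner`, and `caption_images` are not existing kotaemon names), and the stub stands in for a real BLIP or OpenAI Vision wrapper, which would implement the same `caption` method:

```python
from dataclasses import dataclass
from typing import List, Protocol


class Captioner(Protocol):
    """Hypothetical interface each captioning backend would implement."""

    def caption(self, image_bytes: bytes) -> str:
        ...


@dataclass
class StubCaptioner:
    """Placeholder backend for illustration only.

    A BLIP or OpenAI Vision wrapper would replace this and call the
    actual model inside `caption`.
    """

    fixed_caption: str = "a placeholder caption"

    def caption(self, image_bytes: bytes) -> str:
        # A real backend would run inference on image_bytes here.
        return self.fixed_caption


def caption_images(images: List[bytes], captioner: Captioner) -> List[str]:
    """Return one caption per extracted image, in the same order."""
    return [captioner.caption(image) for image in images]
```

Keeping the backend behind a single method would let BLIP run locally while the OpenAI option stays an opt-in, without changing the extraction code.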
Drawbacks
Since the caption text doesn't exist in the original file, we should clearly indicate that it was generated from an image.
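One possible way to handle this, sketched below, is to store each caption as its own chunk with a visible prefix and a metadata flag, so the UI and citations can distinguish generated text from original text. The `TextChunk` class, the `[Image caption]` prefix, and the metadata keys are all assumptions for illustration, not existing kotaemon structures:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TextChunk:
    """Hypothetical minimal chunk type: text plus free-form metadata."""

    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def make_caption_chunk(
    caption: str, image_id: str, page: Optional[int] = None
) -> TextChunk:
    """Wrap a generated caption so it is clearly marked as image-derived."""
    return TextChunk(
        # The prefix keeps the origin visible even if metadata is dropped.
        text=f"[Image caption] {caption.strip()}",
        metadata={
            "type": "image_caption",  # assumed flag name
            "generated": True,        # text is not in the original file
            "image_id": image_id,
            "page": page,
        },
    )
```

For example, `make_caption_chunk("A bar chart of sales", "img-3", page=2)` yields a chunk whose text starts with `[Image caption]` and whose metadata marks it as generated.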
Additional information
No response