[bounty] Video LLM for Search

Open BenraouaneSoufiane opened this issue 4 months ago • 1 comments

Enable more powerful search using visual and audio context.

- [ ] Use Video LLMs like:
  - [ ] Video-LLaMA
  - [ ] Video-ChatGPT
  - [ ] MiniGPT-4 + CLIP
- [ ] Convert video to frames + audio:
  - [ ] ffmpeg to extract frames/audio
- [ ] Send multimodal input to the LLM
- [ ] Output: searchable embeddings or semantic summaries
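
As a rough illustration of the "convert video to frames + audio" step, here is a minimal Python sketch that shells out to ffmpeg. The function names (`frame_cmd`, `audio_cmd`, `extract`), the output layout, the 1 fps sampling rate and the 16 kHz mono WAV format are all assumptions made for this example, not requirements of the bounty or existing project code. It also assumes `ffmpeg` is available on `PATH`.

```python
import subprocess
from pathlib import Path


def frame_cmd(video: str, out_dir: str, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    # Hypothetical naming scheme: frame_000001.jpg, frame_000002.jpg, ...
    pattern = str(Path(out_dir) / "frame_%06d.jpg")
    return ["ffmpeg", "-y", "-i", video, "-vf", f"fps={fps}", "-q:v", "2", pattern]


def audio_cmd(video: str, out_wav: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that extracts the audio track as mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y", "-i", video,
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample (16 kHz is an assumed default)
        "-c:a", "pcm_s16le",
        out_wav,
    ]


def extract(video: str, work_dir: str, fps: float = 1.0):
    """Run both commands and return (sorted frame paths, audio path)."""
    out = Path(work_dir)
    frames_dir = out / "frames"
    frames_dir.mkdir(parents=True, exist_ok=True)
    wav = out / "audio.wav"
    subprocess.run(frame_cmd(video, str(frames_dir), fps), check=True)
    subprocess.run(audio_cmd(video, str(wav)), check=True)
    return sorted(frames_dir.glob("frame_*.jpg")), wav
```

Calling `extract("recording.mp4", "work")` would leave sampled frames in `work/frames/` and the audio in `work/audio.wav`, ready to be passed to whichever Video LLM is chosen. Keeping the command construction separate from execution makes the ffmpeg arguments easy to inspect or adjust.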

The checklist above covers all the exact things that need to be done to receive the bounty.

Precision is important; otherwise the bounty cannot be awarded.

/bounty 400

This is necessary because it matches user needs.

This issue is a response related to issue #1142.

Aug 05 '25 10:08 BenraouaneSoufiane

@louis030195 I just created this second issue concerning #1142, if you can take a look.

Aug 05 '25 11:08 BenraouaneSoufiane