WhisperFusion

Published 20 hours ago

Issues

Can we have this working with a vision language model?

Open • pranav-deshpande opened this issue 6 months ago • 0 comments

Examples of vision language models:

https://huggingface.co/microsoft/Phi-3-vision-128k-instruct
https://huggingface.co/LanguageBind/Video-LLaVA-7B-hf
https://huggingface.co/Vision-CAIR/MiniGPT4-Video

Jul 28 '24 13:07 pranav-deshpande