Can InternVideo-2.5 support multiple images as input?

Open dongdk opened this issue 11 months ago • 1 comments

Hi, the example given on HF uses a video path as input. Can the model also take multiple images as input? If it can, could you give a code example showing how to use it? Thanks.

Feb 05 '25 08:02 dongdk

They convert the MP4 files to individual frames, and those frames are what get passed to the model. See https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/demo_video_text_retrieval.ipynb
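If the model only ever sees frames, an ordered list of image files can probably be treated like a decoded clip. Here is a rough sketch of just the frame-selection step, assuming the model expects a fixed number of frames (called `num_segments` here). The helper names `sample_frame_indices` and `select_images` are made up for illustration and are not part of the InternVideo code; loading, resizing, and normalizing the images still has to follow whatever the linked notebook or the HF example does.

```python
from typing import List, Sequence


def sample_frame_indices(num_frames: int, num_segments: int) -> List[int]:
    """Pick the middle index of each of `num_segments` equal spans.

    Hypothetical helper: mimics uniform frame sampling over a clip.
    If there are fewer images than segments, indices repeat.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    span = num_frames / num_segments
    return [
        min(num_frames - 1, int(span * i + span / 2))
        for i in range(num_segments)
    ]


def select_images(image_paths: Sequence[str], num_segments: int) -> List[str]:
    """Treat an ordered list of image files as the frames of one clip."""
    indices = sample_frame_indices(len(image_paths), num_segments)
    return [image_paths[i] for i in indices]


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
    print(select_images(paths, 2))
```

The selected images would then go through the same preprocessing as video frames and be stacked along the frame dimension before being passed to the model. I have not verified this against InternVideo-2.5 itself.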

Nov 13 '25 18:11 charlielito