InternVideo
Can InternVideo2.5 support multiple images as input?
Hi, the example given on HF uses a video path as input. Can the model also take multiple images as input? If it can, could you share a code example showing how to use it? Thanks.
They convert the MP4 files to individual frames, and those frames are what gets passed to the model, so a list of images should be usable in the same way. See https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/demo_video_text_retrieval.ipynb
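Not the official code, just a rough sketch of the idea: if the model only ever sees a stack of frames, a set of images can be stacked into the same kind of array. Everything here is an assumption for illustration: `images_to_frames` is a hypothetical helper, the ImageNet mean/std and the `(T, 3, H, W)` layout are common defaults rather than confirmed InternVideo settings, and the dummy arrays stand in for images you would load and resize yourself. Check the notebook above for the preprocessing the model actually expects.

```python
import numpy as np

# ImageNet statistics: a common default, NOT confirmed for InternVideo (assumption).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def images_to_frames(images):
    """Hypothetical helper: stack same-sized HxWx3 uint8 images into a
    (T, 3, H, W) float32 array, treating each image as one video frame."""
    if not images:
        raise ValueError("need at least one image")
    if len({img.shape for img in images}) != 1:
        raise ValueError("all images must be resized to the same shape first")
    frames = np.stack(images).astype(np.float32) / 255.0  # (T, H, W, 3), values in [0, 1]
    frames = (frames - MEAN) / STD                        # per-channel normalization
    return frames.transpose(0, 3, 1, 2)                   # (T, 3, H, W)


# Dummy 224x224 images standing in for real, already-resized images.
images = [np.full((224, 224, 3), i * 60, dtype=np.uint8) for i in range(4)]
frames = images_to_frames(images)
print(frames.shape)  # (4, 3, 224, 224)
```

The resulting array would then be converted to a tensor and given to the model wherever the demo passes its decoded video frames. Whether the model needs a fixed number of frames or a particular frame order is something to verify against the demo.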