InternVideo Zero-shot retrieval reproduction issue

Open: jqsun98 opened this issue 10 months ago • 1 comment

According to the README at https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo1/Downstream/Video-Text-Retrieval, running ./zeroshot_scripts/eval_msrvtt.sh should produce the zero-shot retrieval results. That script executes main_task_retrieval.py, but the model built in main_task_retrieval.py is CLIP4Clip, not ViCLIP. How can I run zero-shot video-text retrieval experiments with the pretrained ViCLIP?

Apr 25 '24 09:04 jqsun98

You may need to use the code in InternVideo2/multi_modality and add a model definition for ViCLIP.

Apr 30 '24 02:04 leexinhao
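
For anyone wiring ViCLIP into a different evaluation script, the retrieval metric itself does not depend on the model. The sketch below is not taken from the InternVideo code base; it only illustrates how text-to-video Recall@K can be computed once text and video embeddings are available. The function names (l2_normalize, similarity_matrix, recall_at_k) and the toy embeddings are hypothetical, and it assumes that caption i is paired with video i.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def similarity_matrix(text_embs, video_embs):
    """Cosine similarity between every text (rows) and every video (columns)."""
    texts = [l2_normalize(t) for t in text_embs]
    videos = [l2_normalize(v) for v in video_embs]
    return [[sum(a * b for a, b in zip(t, v)) for v in videos] for t in texts]


def recall_at_k(sim, k):
    """Text-to-video Recall@K, assuming text i matches video i (an assumption)."""
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true video = number of videos scoring strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(sim)


# Toy embeddings standing in for encoder outputs (made up for illustration).
text_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_embs = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]

sim = similarity_matrix(text_embs, video_embs)
r1 = recall_at_k(sim, 1)
r3 = recall_at_k(sim, 3)
```

In a real run, text_embs and video_embs would come from whatever ViCLIP text and video encoders you load; how to load them is outside the scope of this sketch.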
