InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Load-checkpoint for vision encoder shows a list containing bool False. Why is this behavior observed? 
The error occurred when I tried to fine-tune in single_modality: Traceback (most recent call last): File "run_finetuning.py", line 719, in main(opts, ds_init) File "run_finetuning.py", line 636, in main train_stats =...
Thanks for your great work! This is more of a question than a bug report. Say I have a short clip (8s) of a door being opened and another one...
Hi, I was wondering when the InternVideo2 s3-1B for zero-shot video captioning weights will be released! Thank you. Chris
Thank you for your interesting work and for sharing your code! I'm unsure whether the zero-shot performance on MSRVTT reported [here](https://github.com/OpenGVLab/InternVideo/tree/main/Downstream/Video-Text-Retrieval#our-results) requires setting “--mergeclip=True”. Below is the...
Hello, thanks for the great work! I am trying to download the InternVid 10M Dataset through yt-dlp but it seems that YouTube blocks the IP address fairly quickly, after maybe...
Error occurred while running demo.ipynb in InternVideo2's multi_modality demo. I installed packages according to requirements.txt. ``` ModuleNotFoundError Traceback (most recent call last) Cell In[1], line 11 6 import torch 8...
How much GPU memory is required to run the evaluation experiment of internvideostage2?
Hi, could you please release the weights for the InternVideo2 6B stage2 version, along with a license file? Thanks :)