InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

170 InternVideo issues, sorted by recently updated

I would like to join the group chat, but it has already reached the 200-member limit. Could we create a new group?

Hi, I see that the demo uses the InternVideo2-Stage2_1B-224p-f4 model. Can I directly change the model to OpenGVLab/InternVideo2-Stage2_6B and change num_frames to 8? I hope to see better results since...
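For context on the num_frames change, here is a minimal sketch of uniformly sampling 8 frames from a video with OpenCV. The resolution, normalization, and any further preprocessing the InternVideo2 checkpoints expect are assumptions; take them from the demo notebook's own config.

```python
import cv2
import numpy as np

def sample_frames(video_path: str, num_frames: int = 8, size: int = 224) -> np.ndarray:
    """Uniformly sample `num_frames` RGB frames and resize each to `size` x `size`."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV decodes as BGR
        frames.append(cv2.resize(frame, (size, size)))
    cap.release()
    return np.stack(frames)  # (num_frames, size, size, 3), uint8

# e.g. frames = sample_frames("example.mp4", num_frames=8)
```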

Hi @leexinhao, I am trying text-to-video retrieval on my dataset using https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/demo_video_text_retrieval.ipynb, but the similarity scores are coming out very low between the text_features and the...
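Whatever the underlying cause, below is a minimal sketch of the scoring step commonly used for this kind of retrieval: L2-normalize both embedding sets and take the dot product, so cosine scores fall in [-1, 1] and only the ranking matters. The `video_features` and `text_features` tensors are assumed to come from the notebook above; this is not the notebook's own code.

```python
import torch
import torch.nn.functional as F

def retrieval_scores(video_features: torch.Tensor, text_features: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity matrix between video and text embeddings.

    video_features: (num_videos, dim), text_features: (num_texts, dim).
    Returns a (num_videos, num_texts) matrix of scores in [-1, 1].
    """
    v = F.normalize(video_features, dim=-1)
    t = F.normalize(text_features, dim=-1)
    return v @ t.T

# Ranking example: for each text query, a higher score means a better match.
# scores = retrieval_scores(video_features, text_features)
# best_video_per_text = scores.argmax(dim=0)
```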

In InternVideo2.5 you show an example video in which certain parts are highlighted. Do you have any example code for doing something similar?
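The highlighting itself, independent of how the highlighted regions are obtained, could be drawn as a translucent overlay. Below is a minimal sketch with OpenCV, assuming per-frame boxes are already available; this is an illustrative guess at what the highlighted example video shows, not the repository's method.

```python
import cv2

def highlight_box(frame, box, alpha=0.4, color=(0, 255, 0)):
    """Overlay a translucent highlight on `frame` inside `box` = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    overlay = frame.copy()
    cv2.rectangle(overlay, (x1, y1), (x2, y2), color, thickness=-1)  # filled box
    return cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0)

# Applied per frame while re-encoding the video, e.g. with cv2.VideoWriter.
```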

Why do the video features I extract for retrieval with the 6B model from https://huggingface.co/OpenGVLab/InternVideo2-Stage2_6B (retrieval code based on https://huggingface.co/OpenGVLab/InternVideo2-Stage2_6B/blob/main/demo.py) perform worse than those from the 1B model at https://huggingface.co/OpenGVLab/InternVideo2-Stage2_1B-224p-f4 (retrieval code based on https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/demo_video_text_retrieval.ipynb)? Any explanation would be appreciated.

Hi, for the InternVideo2-S/B/L encoders: what value was used for `sep_image_video_pos_embed`? It seems this was set to true in the 1B/6B models but false in S/B/L. I am trying to...

Hi, InternVideo2 is great work, and I appreciate your contribution to the community. However, I can't find the relevant code for using the two teachers in stage 1. As said...

Hello, First of all, I sincerely apologize for the duplicate issue on GitHub and Hugging Face. I found the model at [this link](https://huggingface.co/OpenGVLab/InternVideo2-Stage2_6B) on Hugging Face and proceeded to run...

Can you give an example of how to use the model OpenGVLab/InternVideo2-Stage2_6B-224p-f4? Thanks.
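Pending an official example, here is a minimal sketch of the first step, assuming the checkpoint is hosted on Hugging Face under the name given in the question: download the files locally, then build the model with the repository's multi_modality code as in demo_video_text_retrieval.ipynb. The repo id and that wiring are assumptions.

```python
# Fetch the checkpoint files; building the model afterwards follows the repository's
# own demo code (demo_video_text_retrieval.ipynb), which is not shown here.
from huggingface_hub import snapshot_download

# Assumption: the model name from the question is also the Hugging Face repo id.
local_dir = snapshot_download(repo_id="OpenGVLab/InternVideo2-Stage2_6B-224p-f4")
print(local_dir)  # local path containing the downloaded weights and configs
# Point the demo notebook's checkpoint path at this directory.
```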