InternVideo2 stage2 demo model change

Open LeonCai1 opened this issue 9 months ago • 6 comments

Hi, I see that the demo uses the InternVideo2-Stage2_1B-224p-f4 model. Can I directly switch the model to OpenGVLab/InternVideo2-Stage2_6B and change num_frames to 8? I hope to get better results, since my videos are typically 35s to 60s long.

LeonCai1 avatar Mar 26 '25 00:03 LeonCai1

@LeonCai1 Try the interpolate_pos_embed method in backbone.internvideo2.pos_embed. In my case, it worked properly.
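
For readers unfamiliar with the idea: interpolating position embeddings along the time axis means resampling a table learned for one frame count so that it covers another. The sketch below is a plain-Python illustration of that idea only. It is not the repository's interpolate_pos_embed, whose signature is not shown in this thread; the helper name `interpolate_temporal`, the nested-list `[T][N][C]` layout, and the endpoint-aligned linear scheme are all assumptions made for illustration.

```python
def interpolate_temporal(pos_embed, new_t):
    """Linearly resample a [T][N][C] position-embedding table to new_t frames.

    Hypothetical helper for illustration only; it is not the repository's
    interpolate_pos_embed.
    """
    old_t = len(pos_embed)
    if new_t == old_t:
        return [[list(tok) for tok in frame] for frame in pos_embed]
    out = []
    for i in range(new_t):
        # Map the new frame index onto the old time axis (endpoints aligned).
        pos = i * (old_t - 1) / (new_t - 1) if new_t > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, old_t - 1)
        w = pos - lo
        # Blend the two neighbouring frames, token by token, channel by channel.
        frame = [
            [(1 - w) * a + w * b for a, b in zip(tok_lo, tok_hi)]
            for tok_lo, tok_hi in zip(pos_embed[lo], pos_embed[hi])
        ]
        out.append(frame)
    return out


# Toy table: 4 frames, 2 spatial tokens per frame, 1 channel per token.
table = [[[float(t)], [float(t) + 10.0]] for t in range(4)]
resized = interpolate_temporal(table, 8)
print(len(resized), len(resized[0]), len(resized[0][0]))
```

Only the temporal dimension changes here; the spatial tokens within each frame are left as they are. On real tensors the same resampling would normally be done with a framework's interpolation routine rather than Python loops.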

joooooonyoung avatar Mar 27 '25 08:03 joooooonyoung

@newcommandd How do I use interpolate_pos_embed? When I directly change num_frames to 8, the run fails.

taylover-pei avatar Apr 24 '25 02:04 taylover-pei

@newcommandd The errors are as follows:

File "/InternVideo2/models/backbones/internvideo2/internvideo2.py", line 608, in forward
    x = x + pos_embed
RuntimeError: The size of tensor a (1025) must match the size of tensor b (2049) at non-singleton dimension 1
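
The two sizes in this error match the token counts for 4 and 8 frames, under some assumptions: a 224x224 input, 14x14 patches, one token per patch per frame, and a single class token. These values are inferred from the numbers in the traceback and the "224p" in the model name, not read from the model config, and `num_tokens` is a hypothetical helper used only for the arithmetic.

```python
def num_tokens(num_frames, img_size=224, patch_size=14, cls_tokens=1):
    # Assumed layout: one token per spatial patch per frame, plus class token(s).
    patches_per_side = img_size // patch_size
    return cls_tokens + num_frames * patches_per_side ** 2


print(num_tokens(4))  # 1025
print(num_tokens(8))  # 2049
```

If these assumptions hold, one operand of `x + pos_embed` carries 4 frames' worth of tokens and the other carries 8. The number of frames fed to the model and the number of frames the position embedding was built for would then need to agree, either by sampling the same number of frames or by interpolating the embedding.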

taylover-pei avatar Apr 24 '25 02:04 taylover-pei

I have solved the problem. Thanks.

taylover-pei avatar Apr 24 '25 03:04 taylover-pei

@taylover-pei I hit the same error when I change the demo code to 1b_clip. Could you kindly show me how you solved it?

MxLearner avatar Apr 26 '25 04:04 MxLearner

@taylover-pei Me too..

ilileun avatar May 13 '25 05:05 ilileun