InternVL
Questions about your model on Video-MME
I noticed your recent strong result on Video-MME (https://video-mme.github.io/home_page.html#leaderboard): the model ranks 9th, with 20B parameters and 10 input frames. You also announced this result on the GitHub homepage. I am curious:
- How did you test your model? The model you have open-sourced seems to be a single-frame model. How did you extend it to take 10 frames?
- Which model is your 20B model? Has it been released as open source?
Thank you!