
Questions about your model on Video-MME

Open: zmj1203 opened this issue 8 months ago • 1 comment

I noticed your recent strong result on the Video-MME leaderboard (https://video-mme.github.io/home_page.html#leaderboard): the model ranks 9th, with 20B parameters and 10 input frames. You also announced this result on the GitHub homepage. I am curious about two things:

  1. How did you test your model? The model you have open-sourced appears to be a single-frame model, so how did you extend it to take 10 frames as input?
  2. Which model is your 20B model? Has it been released as open source?

Thank you!

[Attached image: "捕获" (Capture)]

zmj1203, Jul 03 '24 11:07