Ask-Anything Cannot reproduce videochatgpt video benchmark

Open Leo-Yuyang opened this issue 1 month ago • 2 comments

Dear authors,

Your paper reports very impressive performance on the Video-ChatGPT video benchmark, but I could not find any code for reproducing that experiment. I therefore modified the MVBench evaluation code to run inference on this task, but I cannot reproduce the reported results.

[image: b703402d502870e4daff910c65caf6a]

What I got is: [image not included]

The model is the right one: I used the same checkpoint to reproduce the reported performance on MVBench. The only remaining difference I can think of is the prompt used when testing on this dataset, but I find it hard to believe that a different prompt alone could cause such a large gap.

Could you please share the prompt you used for this benchmark, or the raw inference results? Extraordinary claims require extraordinary evidence.

May 16 '24 03:05 Leo-Yuyang
