
mmstar dataset

Open jhxiang opened this issue 3 months ago • 1 comments

I deployed the InternVL3_5-241B-A28B model and tested it on the MMStar benchmark, but I only got 73.76% accuracy. How was the 77.9% accuracy reported in the paper obtained? Should I enable thinking mode, and what generation config parameters were used?

jhxiang, Sep 06 '25 02:09

Thanks for raising this. In our experiments for the paper, we enabled Thinking Mode when evaluating InternVL3_5-241B-A28B on the MMStar benchmark. The unified Thinking Mode generation parameters are as follows:

max_new_tokens = 65536
do_sample = True
temperature = 0.6
top_p = 0.95
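For anyone reproducing this, the parameters quoted in the reply can be collected into a plain Python dict. This is only a minimal sketch: the four values come from the reply, while the names `THINKING_GENERATION_CONFIG` and `build_generation_config` are hypothetical and not part of InternVL. How Thinking Mode itself is switched on is not stated in this thread, so that step is left out.

```python
# Sampling parameters quoted in the reply for Thinking Mode evaluation.
# The dict and helper names below are illustrative assumptions, not InternVL API.
THINKING_GENERATION_CONFIG = {
    "max_new_tokens": 65536,
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
}


def build_generation_config(**overrides):
    """Return a copy of the quoted settings, optionally overriding known keys."""
    unknown = set(overrides) - set(THINKING_GENERATION_CONFIG)
    if unknown:
        raise KeyError(f"unexpected generation parameter(s): {sorted(unknown)}")
    # Merge into a new dict so the module-level defaults are never mutated.
    return {**THINKING_GENERATION_CONFIG, **overrides}


generation_config = build_generation_config()
print(generation_config)
```

A dict like this would then be handed to whatever generation entry point your deployment exposes. Because `do_sample` is `True`, scores will likely vary somewhat from run to run, so an exact match with the paper's number should not be expected from a single evaluation.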

Sorr7maker, Sep 08 '25 13:09