[Benchmark] Support MMReason
This PR adds evaluation support for the MMReason benchmark (accepted at ICCV 2025), which is designed to assess the reasoning capabilities of MLLMs.
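For context on what "support" entails here: benchmarks in VLMEvalKit are typically wired in as a dataset class that declares where the data TSV lives and how predictions are scored, and the class is then registered in `vlmeval/dataset/__init__.py`. The sketch below is only a hypothetical outline of that pattern, not the code in this PR; the class name, `DATASET_URL`/`DATASET_MD5` values, column names, and the exact-match scoring are all placeholder assumptions.

```python
# Hypothetical sketch of a VLMEvalKit dataset class; URLs, MD5s, and the
# scoring logic are placeholders, not the actual implementation in this PR.
from vlmeval.dataset.image_base import ImageBaseDataset
from vlmeval.smp import load, dump


class MMReason(ImageBaseDataset):
    TYPE = 'VQA'  # assumption: free-form reasoning answers
    DATASET_URL = {'MMReason_testmini': '<placeholder TSV URL>'}
    DATASET_MD5 = {'MMReason_testmini': '<placeholder md5>'}

    def evaluate(self, eval_file, **judge_kwargs):
        # Load the predictions produced by run.py and score them with a
        # simple exact match (placeholder for the real judging logic).
        data = load(eval_file)
        hit = [
            str(pred).strip().lower() == str(gt).strip().lower()
            for pred, gt in zip(data['prediction'], data['answer'])
        ]
        score = {'overall': sum(hit) / len(hit) * 100}
        result_file = eval_file.replace('.xlsx', '_acc.json')
        dump(score, result_file)
        return score
```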
Before submitting, I tested that the code runs successfully on Qwen2.5-VL and Qwen3-VL. The commands to run the evaluation are as follows:
```bash
python3 run.py --data MMReason_testmini --model Qwen2.5-VL-7B-Instruct --verbose
python3 run.py --data MMReason_testmini --model Qwen3-VL-8B-Instruct --verbose
```
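Since `run.py` accepts multiple values for `--data` and `--model`, the two checks above should also be runnable as a single invocation:

```bash
python3 run.py --data MMReason_testmini --model Qwen2.5-VL-7B-Instruct Qwen3-VL-8B-Instruct --verbose
```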