lighteval
[EVAL] Add MathVista
Evaluation short description
- Why is this evaluation interesting? MathVista is one of the most popular and commonly reported benchmarks for multimodal LLMs such as Qwen2.5_VL and InternVL3_5.
- How used is it in the community? It is used to evaluate multimodal visual reasoning capabilities on math questions.
Evaluation metadata
Provide all available
- Paper url: https://arxiv.org/pdf/2310.02255
- Github url: https://github.com/lupantech/MathVista
- Dataset url: https://huggingface.co/datasets/AI4Math/MathVista
- Project page: https://mathvista.github.io/
Hey @ThakurRajAnand @NathanHB, can I work on this?