
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Results 8 MathVista issues

I had been working more closely with this repo a few weeks ago and thought I would try to contribute some of the modifications back for others to benefit. ##...

- Returning None when extraction is empty prevents choosing one of the choices based on Levenshtein distance
- Also return None on str coercion failure since returning empty string would...
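To illustrate why an empty extraction matters here, below is a minimal sketch rather than the repository's actual code. The names `levenshtein`, `closest_choice`, and `normalize_choice` are hypothetical, and the edit distance is written out by hand so that the example has no dependencies. It shows that nearest-choice matching always returns some choice, even for an empty string, unless the empty case is guarded first.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_choice(extraction: str, choices: Sequence[str]) -> str:
    """Unguarded matching: always returns one of the choices."""
    return min(choices, key=lambda c: levenshtein(extraction, c))


def normalize_choice(extraction: object, choices: Sequence[str]) -> Optional[str]:
    """Guarded matching (hypothetical helper): None for missing or empty input."""
    if extraction is None:
        return None
    try:
        text = str(extraction).strip()
    except Exception:
        # str coercion failed; treat as "no answer" instead of an empty string
        return None
    if not text:
        return None
    return closest_choice(text, choices)


if __name__ == "__main__":
    choices = ["yes", "no"]
    print(closest_choice("", choices))    # an empty string still maps to a choice
    print(normalize_choice("", choices))  # the guard reports no answer instead
```

With the guard in place, a model that outputs nothing is scored as having given no answer, rather than being credited whenever the shortest choice happens to be correct.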

I was debugging an issue with our model outputting empty responses for all questions and noticed the accuracy score was still 22% when I expected it should be 0%. I...

The `get_response` function takes `image_path` but the variable is unused. I assumed it would be useful if targeting another LMM like GPT4V; however, the code to set the image path...

More of an optimization than a bug or an issue with evaluation, but I think it is worth noting in case someone considers it worth addressing. `generate_response.py` and `extract_answer.py` use an...

There is an implementation in `utilities#get_chat_response` and `models/gpt#get_response`. These could be unified:
https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/utilities.py#L159-L199
https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/models/gpt.py#L16-L55

Hi, I was wondering about the score of GPT-4O: it's 63.8 on testmini, but I could only get around 55 on my side. Also, I got a slightly lower score for...