MathVista
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
I had been working more closely with this repo a few weeks ago and thought I would try to contribute some of the modifications back for others to benefit. ##...
- Returning None when extraction is empty prevents choosing one of the choices based on Levenshtein distance
- Also return None on str coercion failure since returning empty string would...
I was debugging an issue where our model output empty responses for all questions and noticed the accuracy score was still 22% when I expected it to be 0%. I...
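The two reports above describe the same failure mode: an empty extraction that is still matched to the nearest multiple-choice option by edit distance can be scored as correct by chance. The following is a minimal sketch of that effect and of the proposed guard, not the repository's actual code. The names `levenshtein`, `closest_choice`, and `normalize_choice` are hypothetical and exist only for this illustration.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_choice(extraction: str, choices: Sequence[str]) -> str:
    """Unguarded matching: always returns some choice, even for empty input."""
    return min(choices, key=lambda c: levenshtein(extraction, c))


def normalize_choice(extraction: object, choices: Sequence[str]) -> Optional[str]:
    """Guarded matching (hypothetical helper): None means 'no answer'."""
    if extraction is None:
        return None
    try:
        text = str(extraction).strip()
    except Exception:
        # Coercion failed: report "no answer" rather than an empty string,
        # which would otherwise be matched to some choice downstream.
        return None
    if not text:
        return None
    return closest_choice(text, choices)


choices = ["A", "B", "C", "D"]
unguarded = closest_choice("", choices)   # an empty answer still maps to a choice
guarded = normalize_choice("", choices)   # an empty answer stays unanswered
```

With the guard in place, the scoring step can treat `None` as incorrect, so a model that returns only empty responses would score 0% on multiple-choice questions instead of a chance-level figure.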
The `get_response` function takes `image_path` but the variable is unused. I assumed it would be useful if targeting another LMM like GPT4V; however, the code to set the image path...
More of an optimization rather than bug or issue with evaluation, but I think worth noting in case someone thinks it is worthy to address. generate_response.py and extract_answer.py use an...
There is an implementation in `utilities#get_chat_response` and `models/gpt#get_response`. These could be unified https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/utilities.py#L159-L199 https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/models/gpt.py#L16-L55
Hi, I have a question about the GPT-4o score: it is reported as 63.8 on testmini, but I could only get around 55 on my side. I also got a slightly lower score for...
socre -> score
Hello, could you please explain how prefetch_rate is calculated, what it represents, and what it can indicate?
```
[18:38:19] INFO [root] MathVista: Extract Answers - Start
usage: extract_answer.py [-h] [--results_file_path RESULTS_FILE_PATH]
                         [--response_label RESPONSE_LABEL]
                         [--max_num_problems MAX_NUM_PROBLEMS]
                         [--quick_extract] [--rerun]
                         [--save_every SAVE_EVERY]
                         [--azure_openai_api_endpoint AZURE_OPENAI_API_ENDPOINT]
                         [--azure_openai_api_key AZURE_OPENAI_API_KEY]
                         [--azure_openai_api_version AZURE_OPENAI_API_VERSION]
                         [--azure_openai_model AZURE_OPENAI_MODEL]...