promptbench

Access to per-sample evaluation results

Open adhirajghosh opened this issue 1 year ago • 1 comments

Hi, thanks for the great work! For my current project, I would like to use the sample-wise evaluation results of the VLMs from the experiments you have conducted.

I would greatly appreciate it if you could provide the sample-wise evaluation logs on the multimodal datasets mentioned (VQAv2, NoCaps, MMMU, MathVista, AI2D, ChartQA, ScienceQA) for the models evaluated (BLIP2, LLaVA, Qwen-VL, Qwen-VL-Chat, InternLM-XComposer2-VL, GPT-4v, Gemini Pro Vision, Qwen-VL-Max, Qwen-VL-Plus). If I missed a dataset or model, please feel free to include it.

adhirajghosh avatar Apr 14 '24 09:04 adhirajghosh

Hi, I'm sorry to tell you that we cannot provide the sample-wise evaluation logs for you.

MingxuanXia avatar Apr 14 '24 10:04 MingxuanXia

Stale issue message

github-actions[bot] avatar Jun 14 '24 06:06 github-actions[bot]