
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP...

13 Multi-Modality-Arena issues

Hi, thanks for your efforts on this great work! I would like to ask whether you plan to open-source the Chatbot Arena conversation data. Thanks in advance! Best, Wei

Hello and thank you for your amazing work! However, I have a problem: the models are loaded well but I continue getting `NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE...

Wondering whether you used the Karpathy test split for Flickr30k, or a different test set, in your LVLM-eHUB paper. Thanks!

Hi all, could anyone provide the hardware requirements to run and test these models? I am planning to run them on local systems. It would be great if...
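While waiting for official numbers from the maintainers, a back-of-the-envelope VRAM estimate is a common starting point: weights at ~2 bytes per parameter in fp16, scaled by an overhead factor for activations and the KV cache. The sketch below is a heuristic only; the default `bytes_per_param` and `overhead` values are assumptions, not figures from this repository.

```python
def estimate_vram_gb(num_params_billion: float,
                     bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate in GB.

    weights ~= params * bytes_per_param (2 bytes for fp16/bf16),
    multiplied by an overhead factor for activations / KV cache.
    A heuristic, not an exact requirement.
    """
    return num_params_billion * 1e9 * bytes_per_param * overhead / 1e9


# Example: a 7B-parameter model in fp16 needs roughly 17 GB by this estimate.
print(f"{estimate_vram_gb(7):.1f} GB")
```

Quantized weights (8-bit or 4-bit) reduce `bytes_per_param` accordingly, which is why several of these models can also run on smaller consumer GPUs.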

Thanks for releasing this benchmark. We tried to compute the categorical score for each ability but found low scores on several abilities, such as visual reasoning and visual perception. We...

Hello, thanks for the great work! I was looking at [this script](https://github.com/OpenGVLab/Multi-Modality-Arena/blob/main/peng_utils/test_llava.py) for llava evaluation on Flickr30k, but am facing some issues, detailed [here](https://github.com/haotian-liu/LLaVA/issues/768). Could you please help me with...

I ran the script on ScienceQA but it raises an error:

```
File "./Multi-Modality-Arena/LVLM_evaluation/task_datasets/vqa_datasets.py", line 140, in load_save_dataset
    self.image_list.append(sample['image'].convert('RGB'))
    ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'convert'
```
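The traceback suggests `sample['image']` is a dict rather than a decoded `PIL.Image` (some dataset exports, e.g. Hugging Face parquet files, store images as a dict with `bytes`/`path` keys). A possible workaround is a small coercion helper; this is a sketch, and the `bytes`/`path` key names are an assumption about the serialization rather than something confirmed for this repo's ScienceQA copy.

```python
from io import BytesIO

from PIL import Image


def to_pil_rgb(sample_image):
    """Coerce a dataset image field to a PIL RGB image.

    Handles both an already-decoded PIL.Image and the encoded-dict form
    ({'bytes': ..., 'path': ...}) used by some dataset exports (assumed keys).
    """
    if isinstance(sample_image, Image.Image):
        return sample_image.convert('RGB')
    if isinstance(sample_image, dict):
        if sample_image.get('bytes'):
            return Image.open(BytesIO(sample_image['bytes'])).convert('RGB')
        if sample_image.get('path'):
            return Image.open(sample_image['path']).convert('RGB')
    raise TypeError(f"Unsupported image field type: {type(sample_image)!r}")
```

In `load_save_dataset`, replacing `sample['image'].convert('RGB')` with `to_pil_rgb(sample['image'])` would then accept either representation.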

First, I really appreciate your great contributions to the LVLM field. Do you have any plans to release the visual commonsense reasoning (VCR) evaluation code? There's some elaboration about how...