Haodong Duan comments

Results: 129 comments of Haodong Duan

Support Blink Dataset

Hi, @Ezra-Yu , thanks for your suggestion. We will review this dataset.

dify support

Closing the PR since there has been no response for a long time.

best to have docker image

Hi, @max-yue , Yeah, we plan to build a Docker image for VLMEvalKit.

best to have docker image

@Mor-Li It would be good if we could have one or several Dockerfiles that can successfully run the evaluation of tens of representative VLMs.

There is a long gap between the validation accuracy of the dataset of vlmevalkit and the model paper

Hi, @YongLD , Actually, the support of VQA datasets is still in progress (we only share some preliminary results for now). We still cannot obtain the corresponding accuracies reported by...

There is a long gap between the validation accuracy of the dataset of vlmevalkit and the model paper

> @kennymckormick Can we use the azure openai key in VlmEvalKit? How can I change the base_url of azure? Currently, VLMEvalKit does not support openai api key (cuz I do...

There is a long gap between the validation accuracy of the dataset of vlmevalkit and the model paper

> I've noticed that most of the results are dissimilar compared to those in the research paper. I believe that the framework that's in use should be rectified, making it...

ChartQA augmented & CMMMU

Hi, @KosumosuL , ChartQA augmented is now incorporated. However, there is no news from the developers of CMMMU.

The code for evaluating open question of MMMU is completely wrong

Hi, @xwwu2015 , The current implementation is a fast one, and the problem you mentioned does occur, leading to less accurate performance results for open-ended questions. It will not...

Evaluation of custom models and datasets.

Hi, @juxingyiwan , we are going to implement support for custom multiple-choice datasets first, and will create a tutorial for it. Stay tuned!