MaoSong2022
Hi, currently, all images are converted into image files before being fed into the LLM. This happens in the `build_prompt` function of each dataset, for example:

```python
if self.meta_only:
    tgt_path...
```
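For reference, here is a minimal sketch of the pattern (not the exact upstream code); it assumes the method lives on an `ImageBaseDataset` subclass and that `toliststr` and `self.dump_image` exist as helpers, per my reading of the codebase:

```python
from vlmeval.smp import toliststr  # assumed helper location

def build_prompt(self, line):
    if isinstance(line, int):
        line = self.data.iloc[line]               # look up the record by index
    if self.meta_only:
        tgt_path = toliststr(line['image_path'])  # dataset ships image paths only
    else:
        tgt_path = self.dump_image(line)          # decode base64 image(s) to files on disk
    msgs = []
    if isinstance(tgt_path, list):
        msgs.extend([dict(type='image', value=p) for p in tgt_path])
    else:
        msgs.append(dict(type='image', value=tgt_path))
    msgs.append(dict(type='text', value=line['question']))
    return msgs
```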
Fix the judge model argument error:

```python
judge_model = judge_kwargs.get("model", "gpt-4o-mini")
```

Now the default judge model will be `gpt-4o-mini`.
The change keeps the arguments consistent with the `ImageBaseDataset` initialization. For now, only a few datasets implement this method.

- `ImageBaseDataset` initialization: `def __init__(self, dataset='MMBench', skip_noimg=True)`
- some other...
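As an illustration, a subclass that stays consistent with the base signature might look like the sketch below; `MyCustomDataset` is a hypothetical name and the module path is an assumption:

```python
from vlmeval.dataset.image_base import ImageBaseDataset  # assumed module path

class MyCustomDataset(ImageBaseDataset):  # hypothetical dataset
    # keep the same parameters and defaults as ImageBaseDataset
    def __init__(self, dataset='MMBench', skip_noimg=True):
        super().__init__(dataset=dataset, skip_noimg=skip_noimg)
```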
Hi, thanks for pointing out the problem. I checked the original AMBER dataset; it does not provide a suffix like "Please answer yes or no.". For consistency, we did not add one either.
> That somehow makes sense, I can add this additional instruction to the test prompt of AMBER.

@kennymckormick I can help with this.
I tested with `torchrun --nproc-per-node=1 run.py --data ChartQA_TEST --model Eagle-X5-7B --verbose` and it works fine. Please check whether the env file `/data3/xxf/VLMEvalKit/.env` exists.
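If it is missing, a minimal `.env` might look like the sketch below; which keys you actually need depends on your judge/API setup, and the values shown are placeholders:

```
# Hypothetical .env contents; adjust keys to your setup.
OPENAI_API_KEY=sk-xxxx                      # used by GPT-based judges, if any
OPENAI_API_BASE=https://api.openai.com/v1   # optional custom endpoint
```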
If you fine-tune a supported model using VLMEvalKit, evaluating it should be straightforward. You'll need to define your model, inherit from the base model architecture, and specify the path to...
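For example, here is a hedged sketch of wiring up a fine-tuned checkpoint; the class `Qwen2VLChat`, the `partial` registration pattern, and the path are assumptions based on how existing models appear to be registered in `vlmeval/config.py`:

```python
# A sketch, not the exact upstream API. Assumes your fine-tuned checkpoint
# follows the layout of an already-supported base model (here Qwen2-VL).
from functools import partial
from vlmeval.vlm import Qwen2VLChat  # assumed import; pick the class matching your base model

my_finetuned_model = partial(
    Qwen2VLChat,
    model_path='/path/to/my_finetuned_checkpoint',  # placeholder path
)
# Adding this entry to the supported-model mapping in vlmeval/config.py should
# then let you run: python run.py --data ChartQA_TEST --model my_finetuned_model
```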
Hi, VLMEvalKit does not guarantee that it can reproduce the results reported in the original paper. Many factors can affect the results, including sampling settings, prompts, and so on. You can open an issue on the [qwen2.5-omni Github](https://github.com/QwenLM/Qwen2.5-Omni) to report the problem.
Did the problem occur again? It seems there are some bugs in the GPU configuration.