
Finetuned LLaVA Model output

Open JVal123 opened this issue 6 months ago • 4 comments

Hi, I finetuned the LLaVA-v1.5-13B model and managed to benchmark it on a custom dataset.

However, when benchmarking the finetuned version on that dataset, I noticed that the QA pairs for each image were not printed, as they were during the standard model's evaluation on the same dataset.

Is there a way to activate those?

Thank you.

JVal123 avatar Jun 17 '25 19:06 JVal123

Hi, to evaluate a finetuned model, you need to first configure it in config.py and use the corresponding model name to run the evaluation; see #914 for more details.
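
For reference, registering a finetuned checkpoint usually looks roughly like the sketch below. This is a minimal example, not the definitive setup: it assumes the `supported_VLM` dict in `vlmeval/config.py` and a `functools.partial` wrapper around the `LLaVA` class, which is how recent VLMEvalKit versions are organized; the key `llava_v1.5_13b_ft` and the local checkpoint path are placeholders you would replace with your own.

```python
# vlmeval/config.py (sketch) -- register the finetuned checkpoint under its own key.
# The key 'llava_v1.5_13b_ft' and the local path below are placeholders; use your own.
from functools import partial
from vlmeval.vlm import LLaVA

supported_VLM = {
    # existing entry for the stock model
    'llava_v1.5_13b': partial(LLaVA, model_path='liuhaotian/llava-v1.5-13b'),
    # new entry pointing at the finetuned weights
    'llava_v1.5_13b_ft': partial(LLaVA, model_path='/path/to/your/finetuned-llava-v1.5-13b'),
}
```

You would then pass that key on the command line, e.g. `python run.py --model llava_v1.5_13b_ft --data YourDataset` (adapt the dataset name to your custom benchmark).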

MaoSong2022 avatar Jun 18 '25 07:06 MaoSong2022

Hi! I've already managed to do that. The only difference is that during evaluation, the QA pairs don't show up on the console.

JVal123 avatar Jun 18 '25 08:06 JVal123

Did you check the generate_inner function of the model definition? It would be better to provide a working case so we can locate the potential bug.
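
If the goal is simply to see each question/answer pair on the console, one possible check is to add logging inside the model's generate_inner. The sketch below is an assumption-laden illustration, not VLMEvalKit's actual code: it assumes messages are lists of dicts with 'type'/'value' fields as in recent VLMEvalKit versions, and `self._run_llava` is a hypothetical stand-in for the real inference call.

```python
# Sketch: surface the QA pair during inference by logging inside generate_inner.
# Assumes `message` is a list of dicts with 'type' ('text'/'image') and 'value' keys,
# as in recent VLMEvalKit versions; adjust if your model class differs.
def generate_inner(self, message, dataset=None):
    # Collect the text parts of the prompt so the question can be printed.
    prompt = '\n'.join(m['value'] for m in message if m['type'] == 'text')

    # ... existing LLaVA inference code producing `response` ...
    response = self._run_llava(message, dataset)  # hypothetical helper, stands in for the real call

    # Print the QA pair so it shows up on the console during evaluation.
    print(f'[Q] {prompt}\n[A] {response}')
    return response
```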

MaoSong2022 avatar Jun 19 '25 02:06 MaoSong2022

Yes. This happens with LLaVA only. I'll give you an example below: on the left is part of the evaluation of the standard model, and on the right is part of the evaluation of the finetuned version. In the latter, I don't get the answer like I do in the standard model case.

[Screenshot: side-by-side console output — the standard model evaluation (left) prints the answers, while the finetuned model evaluation (right) does not.]

JVal123 avatar Jun 19 '25 15:06 JVal123