
Can I fine-tune NVILA with multiple images?

Open BaamPark opened this issue 6 months ago • 0 comments

I read the instructions at https://github.com/NVlabs/VILA/tree/main/finetuning, but they only show how to fine-tune with a single image-QA set. Since NVILA can take multiple images as input for inference, would it be possible to fine-tune with multiple images as well?

BaamPark, Jun 08 '25 05:06