Reproducing BLIP2 COCO ITM Fine-tuning and Adding New Data
Hey BLIP-2 team,
Thanks for your great work! I've been trying to reproduce the BLIP-2 COCO ITM fine-tuning using the resources in your repo, but I couldn't find specific instructions or a command for it. As far as I can tell, train_caption_coco.sh
relates to captioning, and blip_itm_large.yaml
is for BLIP-1, not BLIP-2. I also searched the code and previous GitHub issues without luck.
Could you share the exact command or script to run this?
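For reference, this is the pattern I tried, adapted from the existing BLIP-2 run scripts in the repo. The `--cfg-path` value below is purely a guess on my part, since I could not find an actual BLIP-2 ITM config file:

```shell
# Adapted from run_scripts/blip2/train/train_caption_coco.sh.
# NOTE: the config path below is hypothetical -- I could not find a
# BLIP-2 ITM fine-tuning yaml in lavis/projects/blip2/train/.
python -m torch.distributed.run --nproc_per_node=8 train.py \
    --cfg-path lavis/projects/blip2/train/itm_coco_ft.yaml
```

If the intended entry point or config name differs, please point me to the right one.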
Also, I plan to add new fine-tuning data later. Any tips on incorporating new data would be awesome.
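For the new data, my current plan is to mirror the JSON annotation layout I see in the existing COCO caption files (a list of records with `image`, `caption`, and `image_id` keys) and point the dataset config at my own file. This is just my reading of the annotation format, so please correct me if it's wrong:

```python
import json

# Sketch of writing custom fine-tuning samples in the same JSON layout
# as the COCO caption annotations (assumed format: a list of dicts with
# "image", "caption", and "image_id" keys).
new_samples = [
    {
        "image": "custom/img_0001.jpg",  # path relative to the image root
        "caption": "a dog catching a frisbee in the park",
        "image_id": "custom_0001",
    },
    {
        "image": "custom/img_0002.jpg",
        "caption": "two children reading under a tree",
        "image_id": "custom_0002",
    },
]

with open("custom_train.json", "w") as f:
    json.dump(new_samples, f, indent=2)

# Sanity-check that every record has the expected keys.
loaded = json.load(open("custom_train.json"))
assert all({"image", "caption", "image_id"} <= set(r) for r in loaded)
print(f"wrote {len(loaded)} samples")
```

Is this the right format to target, or is there a recommended way to register a custom dataset builder instead?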
Thanks for your help and your amazing work on BLIP-2!