Reproducing BLIP2 COCO ITM Fine-tuning and Adding New Data

Open yonatanbitton opened this issue 1 year ago • 3 comments

Hey BLIP-2 team,

Thanks for your great work! I've been trying to reproduce the BLIP-2 COCO image-text matching (ITM) fine-tuning using the following resources in your repo:

  1. train.py
  2. blip_image_text_matching.ipynb
  3. train_caption_coco.sh
  4. blip_itm_large.yaml

I couldn't find specific instructions or a command to reproduce the COCO ITM fine-tuning. As I understand it, train_caption_coco.sh is for captioning, and blip_itm_large.yaml is for BLIP (v1), not BLIP-2. I also searched the code and previous GitHub issues without finding an answer. Could you share the exact command or script to run this?

Also, I plan to add new fine-tuning data later. Any tips on incorporating new data would be awesome.

Thanks for your help and your amazing work on BLIP-2!

yonatanbitton, May 02 '23 11:05