BLIP
Need a clear understanding of each checkpoint
Hi, thank you for your great work. I am a little confused about the checkpoints posted in the repository. The paper's "Pre-training Details" section says the pre-training dataset contains 14M images, including COCO, Flickr, etc. Does that correspond to the checkpoint at https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_14M.pth?
Also, were both model_base_14M and model_base (https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base.pth) trained with CapFilt?
Thank you for your help!