Any plan to open-source the BLIP-2 training code for the ViT-L/14 image encoder?
load_model("blip2", "pretrain_vitL")~
current, only inference code is there~
You can modify this config file to pre-train with ViT-L: https://github.com/salesforce/LAVIS/blob/main/lavis/projects/blip2/train/pretrain_stage1.yaml
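Roughly, the change goes in the model section of that file. The sketch below is only illustrative; the vit_model key and the clip_L value are taken from memory of the BLIP-2 model code, so double-check them against lavis/models/blip2_models/blip2_qformer.py before training:

```yaml
model:
  arch: blip2
  model_type: pretrain
  freeze_vit: True
  # assumed key/value: swap the frozen image encoder from the default
  # eva_clip_g to CLIP ViT-L/14 (verify the accepted names in the model code)
  vit_model: clip_L
```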
Thanks for the insight.
Also, I downloaded the LAION-multi dataset to do my own training and found that the epoch-based runner_base does not handle very large datasets well. Should I switch to the iteration-based runner_iter instead? Is the runner_iter code well maintained? (I can't find many training examples based on it.)
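For reference, what I had in mind is changing the run section of the same config to something like the sketch below. I'm assuming the runner is selected via a runner key and that runner_iter reads max_iters and iters_per_inner_epoch; I haven't verified this against lavis/runners/runner_iter.py:

```yaml
run:
  runner: runner_iter            # assumed: switch from the default runner_base
  max_iters: 200000              # total optimizer steps instead of max_epoch
  iters_per_inner_epoch: 5000    # assumed: steps per "inner epoch" (logging/checkpoint interval)
```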
Also, do you know why pretrain_stage1.yaml has no validation dataset? Only the training dataset is used. In this case, does stage 2 always start from the last checkpoint, without selecting the best checkpoint based on a validation set?
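What I expected was something like the following in the run section (assuming valid_splits is read the same way as train_splits; the pre-training datasets would also need to ship a val split, and the image-text pre-training task an evaluation step, for this to actually run):

```yaml
run:
  train_splits: ["train"]
  # assumed: enabling validation would look like this, but the web-scale
  # pre-training datasets may not provide val annotations
  valid_splits: ["val"]
```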
Hi, I also ran into problems when pre-training BLIP-2 with COCO and VG. Have you figured it out?