Any plan to open-source the BLIP-2 training code for the ViT-L/14 image encoder?

Open ldfandian opened this issue 1 year ago • 4 comments

load_model("blip2", "pretrain_vitL")

Currently, only the inference code is available.

ldfandian avatar Jul 11 '23 07:07 ldfandian

You can modify this config file to pre-train with ViT-L: https://github.com/salesforce/LAVIS/blob/main/lavis/projects/blip2/train/pretrain_stage1.yaml
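
As a rough sketch of that change (not taken from the repository): the snippet below treats the `model` section of the stage-1 config as a plain dictionary and switches it to the ViT-L variant. The starting keys and values, and the assumption that `pretrain_vitL` is accepted as a `model_type` for training, are illustrative guesses based on the name used in `load_model("blip2", "pretrain_vitL")`. Check them against the actual YAML file before relying on them.

```python
import copy

# Hypothetical stand-in for the `model` section of pretrain_stage1.yaml.
# The real file may contain different keys and values.
stage1_model_cfg = {
    "arch": "blip2",
    "model_type": "pretrain",
    "load_pretrained": False,
    "freeze_vit": True,
}


def use_vit_l(model_cfg):
    """Return a copy of the model config that selects the ViT-L variant."""
    cfg = copy.deepcopy(model_cfg)
    # Assumption: training accepts the same model type name that inference
    # uses in load_model("blip2", "pretrain_vitL").
    cfg["model_type"] = "pretrain_vitL"
    return cfg


vit_l_cfg = use_vit_l(stage1_model_cfg)
print(vit_l_cfg)
```

The original dictionary is left untouched, so both variants can be compared side by side before the change is written back to the YAML file.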

LiJunnan1992 avatar Jul 13 '23 01:07 LiJunnan1992

Thanks for the insight.

Also, I downloaded the LAION-multi dataset to do my own training and found that the epoch-based runner (runner_base) does not support very large datasets. Should I switch to the iteration-based runner (runner_iter) instead? Is the runner_iter code well maintained? I ask because I can't find many training examples based on it.
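
To illustrate the difference being asked about, here is a minimal sketch of iteration-based training over a stream with no known length. It is not the runner_iter code from LAVIS. The function names, `max_iters`, and `checkpoint_every` are hypothetical and only show the general idea: the stopping point and checkpoint schedule are counted in steps, so the dataset size never has to be known.

```python
import itertools


def stream_samples():
    """Stand-in for a very large (effectively endless) streaming dataset."""
    for i in itertools.count():
        yield {"id": i}


def train_by_iterations(samples, max_iters, checkpoint_every):
    """Run a fixed number of steps and record where checkpoints would be saved."""
    checkpoints = []
    for step, sample in enumerate(itertools.islice(samples, max_iters), start=1):
        # A real training step (forward, backward, optimizer update) would go here.
        if step % checkpoint_every == 0:
            checkpoints.append(step)
    return checkpoints


saved = train_by_iterations(stream_samples(), max_iters=10, checkpoint_every=4)
print(saved)
```

An epoch-based loop, by contrast, needs one full pass over the data before it can checkpoint or stop, which is impractical when a single pass takes days.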

ldfandian avatar Jul 15 '23 13:07 ldfandian

Also, do you know why pretrain_stage1.yaml has no validation dataset? Only a training dataset is used. In that case, does stage 2 always use the last checkpoint, rather than choosing the best checkpoint based on a validation dataset?
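
To make the two options concrete, here is a small sketch of the selection logic in question. It is not how LAVIS chooses checkpoints. `pick_checkpoint` and the checkpoint names are hypothetical, and the scores are made-up placeholders rather than results.

```python
def pick_checkpoint(checkpoints, val_scores=None):
    """Pick the checkpoint to carry into the next stage.

    checkpoints: checkpoint names in the order they were saved.
    val_scores: optional dict mapping name -> validation score (higher is better).
    Without validation scores, the only sensible choice is the last checkpoint.
    """
    if not checkpoints:
        raise ValueError("no checkpoints to choose from")
    scored = [c for c in checkpoints if val_scores and c in val_scores]
    if not scored:
        return checkpoints[-1]
    return max(scored, key=lambda c: val_scores[c])


names = ["epoch_0", "epoch_1", "epoch_2"]
last = pick_checkpoint(names)
best = pick_checkpoint(names, {"epoch_0": 0.1, "epoch_1": 0.9, "epoch_2": 0.5})
print(last, best)
```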

ldfandian avatar Jul 15 '23 13:07 ldfandian

Hi, I also ran into a problem when pre-training BLIP-2 with COCO and VG. Have you figured it out?

cactusycy avatar Feb 18 '24 12:02 cactusycy