
How to train BLIP2 on a single 3090 GPU

verigle opened this issue 1 year ago · 3 comments

How can I train BLIP2 on a single 3090 GPU with its 24 GB memory limit?

verigle · Mar 19 '23

We don't have experience with the 3090. Depending on your application, this may or may not be possible given the compute requirements.

dxli94 · Mar 29 '23

I have some experience with 3090s. You can definitely load and train at least OPT-6.7B or FLAN-T5-XL with a small batch_size and accum_grad_iters > 1.
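For reference, a minimal single-GPU invocation along those lines, assuming LAVIS's train.py entry point with its --options key=value overrides; the config path and the run.* key names are from my memory of the repo layout, so verify them against your checkout:

```bash
# Stage-2 pretraining on one GPU with a small batch plus gradient accumulation.
# batch_size_train and accum_grad_iters multiply to the effective batch size.
# Config path and key names are assumptions; check your LAVIS checkout.
python -m torch.distributed.run --nproc_per_node=1 train.py \
    --cfg-path lavis/projects/blip2/train/pretrain_stage2.yaml \
    --options run.batch_size_train=8 run.accum_grad_iters=4
```

With these settings the effective batch size is 8 × 4 = 32, while only 8 samples' activations live on the GPU at once.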

kondvit · Apr 03 '23

@kondvit I have tried training BLIP2 on a single 3090 with the 24 GB memory limit by running pretrain_stage2.sh. Even opt-2.7b doesn't fit at the default settings, but it runs once I set batch = 32. Is something wrong on my end, or is this expected?

fmdmm · Apr 13 '23

I've managed to run pretrain_stage2 on a single 3090 (opt-2.7b, batch size = 32, single worker), but train_caption_coco crashes with an OOM. I've noticed that the image size differs between these two stages; maybe that's the reason? I'm curious why and how the image size can differ if it's the same network.
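One likely piece of the answer: ViT-style encoders can run at different input resolutions (the position embeddings get interpolated), so fine-tuning configs often use a larger image size than pretraining, and a larger input increases activation memory even though the weights are the same. You can compare the two configs directly; the paths below are my assumption of the repo layout:

```bash
# Compare the training resolution of the two stages; a larger image_size in
# the captioning config would explain the OOM. Paths assume the standard
# LAVIS layout; adjust to your checkout.
grep -n "image_size" lavis/projects/blip2/train/pretrain_stage2.yaml \
                     lavis/projects/blip2/train/caption_coco_ft.yaml
```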

s0urcer · Jul 06 '23

Hello, how can I fine-tune the BLIP2 model for image captioning tasks? I really need help to complete this task.
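If it helps, LAVIS ships a COCO captioning fine-tuning setup; a hedged sketch of running it (the script path is an assumption from the repo layout, check run_scripts/ in your checkout):

```bash
# Fine-tune BLIP2 for image captioning on COCO. The wrapper script normally
# launches train.py with the captioning config; path is an assumption.
bash run_scripts/blip2/train/train_caption_coco.sh
```

On a 24 GB card you will likely need to shrink run.batch_size_train in the config, as discussed above.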

shams2023 · Nov 06 '23