
How to train on multiple GPUs (w/ small VRAM)

Open · kebijuelun opened this issue 1 year ago · 4 comments

Thanks for the great work. Do you have any recommended solutions for running the finetune.py code on multiple small-memory GPUs (e.g., fine-tuning LLaMA-30B on 8x V100)?

Is it possible to easily integrate DeepSpeed pipeline parallelism into the current code?

kebijuelun avatar Mar 20 '23 12:03 kebijuelun
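For the small-VRAM case, one common route is naive model parallelism: load the 8-bit base model once and let accelerate spread its layers across all visible GPUs, while LoRA keeps the trainable parameters small. Below is a minimal sketch of that idea; the checkpoint name, the per-GPU memory cap, and the LoRA hyperparameters are illustrative assumptions, not values taken from finetune.py.

import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Sketch: shard an 8-bit LLaMA checkpoint across all visible GPUs and attach
# LoRA adapters. Model ID, memory cap, and LoRA settings are illustrative.
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-30b-hf",        # illustrative checkpoint name
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",                       # accelerate splits layers over GPUs
    max_memory={i: "18GiB" for i in range(torch.cuda.device_count())},
)
model = prepare_model_for_int8_training(model)
model = get_peft_model(
    model,
    LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    ),
)
model.print_trainable_parameters()

The trade-off: with device_map="auto" the layer shards run sequentially, so only one GPU is busy at a time and throughput is much lower than data parallelism, but the 30B weights no longer have to fit on a single V100. As for DeepSpeed, the Hugging Face Trainer integrates ZeRO via a config file, whereas pipeline parallelism would require restructuring the model into pipeline stages, so it is unlikely to be a drop-in change here.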

Have you tried using torchrun?

zachNA2 avatar Mar 20 '23 14:03 zachNA2

Have you tried using torchrun?

I tried torchrun, but DDP cannot reduce the per-GPU VRAM usage.

kebijuelun avatar Mar 21 '23 04:03 kebijuelun
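For context on why torchrun alone does not help: under DDP every rank trains a full replica of the model on its own GPU, so the per-GPU memory requirement is the same as in single-GPU training. The sketch below shows the typical device-placement switch driven by the environment variables torchrun sets (WORLD_SIZE, LOCAL_RANK); it is a hypothetical pattern, not a verbatim excerpt from finetune.py.

import os
import torch
from transformers import LlamaForCausalLM

# Hypothetical sketch of device placement under torchrun vs. a single process.
world_size = int(os.environ.get("WORLD_SIZE", 1))
if world_size > 1:
    # torchrun/DDP: each rank loads a FULL copy of the model onto its own GPU,
    # so per-GPU VRAM usage is unchanged -- only throughput scales.
    device_map = {"": int(os.environ.get("LOCAL_RANK", 0))}
else:
    # Single process: spread layers across all GPUs so a 30B model can fit
    # even when each card is individually too small.
    device_map = "auto"

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-30b-hf",   # illustrative checkpoint name
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map=device_map,
)

So a launch like `torchrun --nproc_per_node=8 finetune.py ...` scales data-parallel throughput but still requires the quantized 30B weights to fit on each V100; some form of sharding (device_map="auto", DeepSpeed ZeRO-3, or FSDP) is what actually lowers the per-GPU footprint.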

I am interested in this as well, bumping for visibility

8bit-coder avatar Mar 27 '23 03:03 8bit-coder

same question

taishiciR avatar Apr 06 '23 11:04 taishiciR