alpaca-lora
How to train on multiple GPU (w/ small vRAM)
Thanks for the great work. Do you have any recommended solutions for running the finetune.py code on multiple small-memory GPUs (e.g., fine-tuning LLaMA-30B on 8×V100)?
Could DeepSpeed's pipeline parallelism be easily integrated into the current code?
Have you tried using torchrun?
I tried torchrun, but DDP does not reduce vRAM usage: it replicates the full model on every GPU, so each card still needs enough memory for the whole model.
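Since DDP only shards the data and not the model, an approach that partitions the model state itself is needed. One option (an assumption, not something the repo ships) is DeepSpeed ZeRO stage 3, which shards parameters, gradients, and optimizer states across GPUs and can optionally offload them to CPU. A minimal config sketch, assuming a Trainer-based script that accepts a DeepSpeed config file:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" },
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```

You would then launch with something like `deepspeed finetune.py --deepspeed ds_config.json` — whether finetune.py passes the flag through to `transformers.Trainer` as-is would need checking; the `"auto"` values are only resolved by the Hugging Face Trainer integration. CPU offload trades speed for memory, so you may want to drop the `offload_*` blocks first and only enable them if stage 3 sharding alone still OOMs.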
I am interested in this as well, bumping for visibility
same question