DeepSpeed
Can your own PC train a DeepSpeed model? [BUG] (Issue #3333)

@kuokay Can you please share the log output? The path is stated in the error message (Log output: /home/kuokay/...)
My own computer runs Windows 11 with a 3060 graphics card. Can it train a 1.3b model?
Is this the 3060 with 12GB of memory? If so, you may be able to train the 1.3b model if you reduce the batch size to 1. I just tested and I was using ~12GB of memory with the following command:
```shell
deepspeed --num_gpus 1 main.py \
  --model_name_or_path facebook/opt-1.3b \
  --gradient_accumulation_steps 2 \
  --lora_dim 128 \
  --zero_stage 0 \
  --deepspeed \
  --output_dir ./output/ \
  --per_device_train_batch_size 1 \
  --per_device_eval_batch_size 1 \
  --gradient_checkpointing
```
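For a rough sense of why batch size 1 with LoRA fits in 12GB while a full fine-tune would not, here is a back-of-envelope memory estimate. The byte-per-parameter figures are standard mixed-precision assumptions, the ~50M trainable-parameter count for LoRA is an illustrative guess, and activations/CUDA overhead are not counted:

```python
def estimate_gb(n_params, trainable_params=None):
    """Rough training-memory estimate in GiB for mixed precision.

    Assumes: fp16 model weights (2 B/param for the full model),
    fp16 gradients for trainable params only (2 B/param), and
    fp32 Adam states plus an fp32 master copy for trainable
    params only (12 B/param). Activations are NOT included.
    """
    GIB = 1024 ** 3
    if trainable_params is None:
        trainable_params = n_params   # full fine-tune: everything trains
    weights = 2 * n_params            # fp16 copy of the whole model
    grads = 2 * trainable_params      # fp16 gradients
    optim = 12 * trainable_params     # fp32 Adam m, v + master weights
    return (weights + grads + optim) / GIB

full = estimate_gb(1.3e9)          # full fine-tune: ~19 GiB before activations
lora = estimate_gb(1.3e9, 50e6)    # LoRA with ~50M trainable params: ~3 GiB
```

Under these assumptions, full Adam fine-tuning of a 1.3b model already needs ~19 GiB for states alone, while LoRA keeps the optimizer footprint small enough that activations (further reduced by `--gradient_checkpointing`) become the remaining pressure on a 12GB card.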
This may still fail due to memory limitations on your system. However, we are working on support for an --offload feature that should further reduce the memory requirements to train these models.
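For context, DeepSpeed's released ZeRO-Offload path can already move optimizer states to CPU memory via the config file. A minimal sketch (the field names follow DeepSpeed's ZeRO config schema; the specific stage and batch-size values are illustrative, not what the --offload flag above will necessarily use):

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  },
  "fp16": { "enabled": true },
  "train_micro_batch_size_per_gpu": 1
}
```

Offloading trades GPU memory for extra CPU RAM usage and slower steps, which is usually the right trade on a single 12GB card.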
Issue is stale, closing.