DreamPose
How to perform distributed training.
When I run the following command on two GPUs within the same node, training still runs on only a single GPU. Why is that?
```shell
accelerate launch --num_processes=4 train.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --instance_data_dir=../path/to/dataset \
  --output_dir=checkpoints --resolution=512 --train_batch_size=2 \
  --gradient_accumulation_steps=4 --learning_rate=5e-6 \
  --lr_scheduler="constant" --lr_warmup_steps=0 --num_train_epochs=300 \
  --run_name dreampose --dropout_rate=0.15 \
  --revision "ebb811dd71cdc38a204ecbdd6ac5d580f529fd8c"
```
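For context, here is what I believe the multi-GPU invocation would look like. This is a minimal sketch, assuming two GPUs on a single node and that `train.py` wraps its model, optimizer, and dataloaders with Accelerate's `Accelerator`; with `accelerate launch`, multi-GPU mode has to be enabled either through a saved `accelerate config` or the `--multi_gpu` flag, and `--num_processes` is expected to match the number of GPUs actually available (2 here, not 4):

```shell
# Minimal sketch, assuming 2 GPUs on one node.
# --multi_gpu enables the torch.distributed launcher;
# --num_processes should equal the number of available GPUs.
accelerate launch --multi_gpu --num_processes=2 train.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --instance_data_dir=../path/to/dataset \
  --output_dir=checkpoints --resolution=512 --train_batch_size=2 \
  --gradient_accumulation_steps=4 --learning_rate=5e-6 \
  --lr_scheduler="constant" --lr_warmup_steps=0 --num_train_epochs=300 \
  --run_name dreampose --dropout_rate=0.15 \
  --revision "ebb811dd71cdc38a204ecbdd6ac5d580f529fd8c"
```

Alternatively, running `accelerate config` once and selecting multi-GPU for this machine should let a plain `accelerate launch train.py ...` pick up both GPUs without extra launcher flags.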