Giyeong Oh


> hi @BootsofLagrangian I am using Windows. Could you please share your accelerate config? And an example of the run script for training? Thank you so much! Here are a...

Accelerate handles this. Inside the [accelerator.accumulate](https://github.com/huggingface/accelerate/blob/159c0dd02a42c30545821b7287376fe4be04d5ee/src/accelerate/accelerator.py#L1046) context manager, accelerate synchronizes gradients and loss via [sync_gradients](https://github.com/huggingface/accelerate/blob/159c0dd02a42c30545821b7287376fe4be04d5ee/src/accelerate/accelerator.py#L1020). sd-scripts utilizes accelerate from Hugging Face, which is very helpful for high-level distributed training.
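In code, the pattern looks roughly like this (a minimal toy sketch, not sd-scripts' actual training loop; the model, optimizer, and `gradient_accumulation_steps=4` are placeholders):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator(gradient_accumulation_steps=4)

# Toy model/data just for illustration.
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = torch.utils.data.TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=4)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    # Inside accumulate(), cross-process gradient sync is skipped until the
    # last micro-step, when accelerator.sync_gradients becomes True; the
    # prepared optimizer's step()/zero_grad() are no-ops on the skipped steps.
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
```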

`accelerate launch --num_processes=[NUM_YOUR_GPUS_PER_MACHINE] --num_machines=[NUM_YOUR_INDEPENDENT_MACHINES] --multi_gpu --gpu_ids=[GPU_IDS] "train_network.py" args...`

If you have 4 GPUs and one machine, give the args as `accelerate launch --num_processes=4 --multi_gpu --num_machines=1 --gpu_ids=0,1,2,3 "train_network.py" args...`

> `accelerate launch --num_processes=[NUM_YOUR_GPUS_PER_MACHINE] --num_machines=[NUM_YOUR_INDEPENDENT_MACHINES] --multi_gpu --gpu_ids=[GPU_IDS] "train_network.py" args...`
>
> If you have 4 GPUs and one machine, give the args as `accelerate launch --num_processes=4 --multi_gpu --num_machines=1 --gpu_ids=0,1,2,3 "train_network.py"`...

@BotLifeGamer Here is an example command line for training a LoRA: `accelerate launch --num_processes=2 --multi_gpu --num_machines=1 --gpu_ids=0,1 "train_network.py" --pretrained_model_name_or_path=[huggingface_path or base model path to use] --network_module=networks.lora --save_model_as=safetensors --caption_extension=".txt" --seed="42" --training_comment=[some comment...`

@Charmandrigo Sorry about that, I only have experience with single-machine training. But I think accelerate supports multi-machine training. If you run `accelerate config`, you can find options for multi-machine...
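For example (untested on my side, and the IP/port are placeholders), a two-machine launch would look something like `accelerate launch --multi_gpu --num_machines=2 --num_processes=8 --machine_rank=0 --main_process_ip=[MAIN_NODE_IP] --main_process_port=29500 "train_network.py" args...`, where `--num_processes` is the total GPU count across all machines, and you run the same command on each machine with its own `--machine_rank` (0 on the main node, 1 on the second).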

> [@BootsofLagrangian](https://github.com/BootsofLagrangian) Your 4 GPUs are from the same brand? Do you know if it's possible to use AMD alongside NVIDIA?

Yes, 4x RTX3090. Heterogeneous device training is a really...

Do your H100s connect via NVLink, or just PCIe? If it's PCIe only, speed degradation occurs due to the PCIe communication bottleneck.
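You can check with `nvidia-smi topo -m`: `NV#` entries in the matrix mean an NVLink connection, while `PIX`/`PXB`/`PHB`/`SYS` mean the path goes over PCIe.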

> > Do your H100s connect via NVLink, or just PCIe? If it's PCIe only, speed degradation occurs due to the PCIe communication bottleneck.
>
> ok it turns out all are...

> @BootsofLagrangian it is not like I purchased them, I am using Massed Compute :)
>
> They said they have SXM4 A100. I will test the script there....