Multi-GPU on Windows

Open MazrimCoding opened this issue 1 year ago • 2 comments

Has anyone successfully got this running?

No combination of accelerate settings, or even changing the script to use gloo, has let me run the script all the way through.

I did get to the point of both GPUs being loaded; however, the script then hit the error "RuntimeError: Trying to create tensor with negative dimension", which I was not able to troubleshoot further.

I'm wondering whether this is simply not compatible for now, or whether it is even sensible, as I have heard that multi-GPU image-training setups have issues with the seed not being shared properly between the GPUs.

MazrimCoding, Apr 10 '24 12:04

Can you share your environment? For example: the number of GPUs, your accelerate configuration, the installed Python libraries, the training configuration, and the command or script you use to run sd-scripts.

The more detail you provide, the clearer a solution you will be able to get.
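
As a rough illustration of the kind of report that helps, here is a minimal sketch that prints the OS, Python version, a few library versions, and the visible GPUs. The package list and the `collect_environment` helper are assumptions made for this example, not part of sd-scripts; adjust the list to whatever your setup actually uses.

```python
import importlib.metadata as md
import platform
import sys

# Assumed list of packages relevant to a multi-GPU sd-scripts run; edit as needed.
PACKAGES = ["torch", "accelerate", "transformers", "diffusers", "xformers", "bitsandbytes"]


def collect_environment(packages=PACKAGES):
    """Return a dict describing the OS, Python, package versions, and GPUs."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = md.version(name)
        except md.PackageNotFoundError:
            info[name] = "not installed"
    try:
        import torch  # only needed for the GPU details

        count = torch.cuda.device_count()
        info["cuda_available"] = torch.cuda.is_available()
        info["gpu_count"] = count
        info["gpus"] = [torch.cuda.get_device_name(i) for i in range(count)]
    except ImportError:
        info["cuda_available"] = "torch not importable"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting this output together with the output of `accelerate env`, your training config, and the exact launch command should cover most of the items listed above.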

BootsofLagrangian, Apr 11 '24 11:04

According to this issue, PyTorch 2.2.1 seems to work with train_network.py. https://github.com/pytorch/pytorch/issues/116056

If 2.2.1 doesn't work, please share more details.
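
To confirm which PyTorch build is actually active in the environment before retrying, something like the following sketch may help. The helper names `parse_version` and `is_suggested_version` are made up for this example; the only assumption about PyTorch is that `torch.__version__` is a string such as `2.2.1+cu121`.

```python
import re


def parse_version(version):
    """Parse the leading numeric release, dropping suffixes like '+cu121'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_suggested_version(version, suggested=(2, 2, 1)):
    """True if the installed release matches the version suggested in this thread."""
    return parse_version(version) == suggested


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        status = "matches" if is_suggested_version(torch.__version__) else "differs from"
        print(f"torch {torch.__version__} {status} 2.2.1")
```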

kohya-ss, Apr 21 '24 08:04