ControlNet
RuntimeError when training on multiple GPUs
I tried to train on multiple GPUs, but while the data is being read I get the following error, even with num_workers=0:
RuntimeError: unable to open shared memory object
I don't have root access, so I can't raise the system-wide open-file limit.
trainer = pl.Trainer(gpus=2, precision=32, callbacks=[logger])
As soon as I change gpus to 1, training works fine. Anyone have ideas?
Same problem here.
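Try raising the open-file limit by running this in the terminal before you start training (note that ulimit -n sets the maximum number of open file descriptors, not the shared memory size; without root you can only raise it up to the hard limit, which ulimit -Hn prints):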
ulimit -n 16384
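If you can't change the shell that launches the job, the same limit can probably be raised from inside the training script. This is only a sketch using Python's standard resource module (Unix only); the helper name raise_nofile_soft_limit is made up for illustration and is not part of ControlNet or PyTorch Lightning. It raises the soft limit as far as the hard limit allows, which does not need root.

```python
import resource


def raise_nofile_soft_limit(target=16384):
    """Raise the soft open-file limit toward `target` without root.

    Hypothetical helper for illustration. The soft limit can only be
    raised up to the hard limit, so `target` is capped at that value.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Cap the request at the hard limit unless the hard limit is unlimited.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Only raise the limit; never lower it or touch an unlimited soft limit.
    if soft != resource.RLIM_INFINITY and soft < target:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # Some systems reject the value; keep the existing limit.
            pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = raise_nofile_soft_limit(16384)
    print(f"open-file limit: soft={soft}, hard={hard}")
```

Call it at the very top of the training script, before the Trainer is created, so that the processes started for each GPU inherit the new limit. If the hard limit itself is too low, this won't help and an admin would have to raise it.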
Also check this issue, in particular the last comment: https://github.com/lllyasviel/ControlNet/issues/165#issuecomment-1464726313