taming-transformers
GPU training hangs
Training with --gpus 0, or --gpus 0,1 hangs at "initializing ddp ...".
I've tried changing ddp to dp, changing nccl to gloo, and upgrading the lightning or torch version, but it still does not work. (Only CPU training works.)
@shyakocat I encountered the same issue. Have you resolved it?
Try running export NCCL_P2P_DISABLE=1 before running the script. It worked for me.
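For anyone trying this, here is a minimal sketch of what that looks like in a bash-like shell. NCCL_P2P_DISABLE=1 tells NCCL not to use direct GPU-to-GPU (peer-to-peer) transfers, which is a common workaround when DDP initialization stalls on some machines. Whether it helps likely depends on your hardware and driver setup. The NCCL_DEBUG line is my own optional addition for getting more log output, and the launch command in the comment is only a placeholder, not something taken from this thread.

```shell
# Disable NCCL peer-to-peer (direct GPU-to-GPU) transfers for this shell session.
export NCCL_P2P_DISABLE=1

# Optional (an addition, not part of the original suggestion): make NCCL log
# its setup steps, which can help show where initialization stalls.
export NCCL_DEBUG=INFO

# Then launch training from the same shell so it inherits both variables.
# The script name and arguments below are placeholders (assumptions):
# python main.py --gpus 0,1
```

The variables only apply to processes started from this shell after the export, so set them again if you open a new terminal or submit the job through a scheduler.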