If one GPU can run, why not multiple GPUs?
Training works on one GPU, but not on multiple GPUs. My PyTorch version is 1.1 and the machine has 4 GPUs. Why can I train on a single GPU but not on more than one? I have tried setting the number of workers to 0, reducing the batch size, and other workarounds, but nothing helps. Can anybody help me? Thanks very much.
@xupine did you solve it? I'm hitting the same problem: with multiple GPUs the code gets stuck and does not run, and the GPU utilization stays at 0.
Not sure if it's related, but I encountered the same problem (able to train on one GPU but not on multiple) and eventually found that a broken CUDA installation was the cause.
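For anyone hitting this, a quick way to check whether the CUDA install itself is broken is to run a tiny cross-GPU test outside the training code. This is just a sanity-check sketch (the tensor size is arbitrary):

```python
import torch

# Basic CUDA sanity checks -- a broken install often fails here already.
print("CUDA available:", torch.cuda.is_available())
print("Device count:", torch.cuda.device_count())

# Move a tensor onto every visible GPU and do a small matmul; a bad
# driver/toolkit mismatch tends to hang or error on this step.
x = torch.randn(1024, 1024)
for i in range(torch.cuda.device_count()):
    y = x.to(f"cuda:{i}")
    print(f"cuda:{i} ->", (y @ y).sum().item())
```

If these single-device copies work but multi-GPU training still hangs at 0% utilization, the problem is often GPU peer-to-peer communication rather than PyTorch itself; setting `NCCL_P2P_DISABLE=1` before launching is a commonly suggested workaround to try.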
You should set the argument --gpu-ids='0,1,2' in train.py and make sure CUDA_VISIBLE_DEVICES=0,1,2 is set in train.sh to enable multi-GPU training.
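For reference, this is roughly how a --gpu-ids flag and nn.DataParallel fit together. The flag name and the placeholder model here are illustrative, assuming this repo follows the usual PyTorch multi-GPU pattern:

```python
import argparse

import torch
import torch.nn as nn

parser = argparse.ArgumentParser()
# Comma-separated list, matching the --gpu-ids '0,1,2' style above.
parser.add_argument("--gpu-ids", type=str, default="0")
args = parser.parse_args()

gpu_ids = [int(i) for i in args.gpu_ids.split(",")]
device = torch.device(f"cuda:{gpu_ids[0]}")

model = nn.Linear(128, 10)  # placeholder for the real model
model = model.to(device)    # DataParallel expects params on gpu_ids[0]
if len(gpu_ids) > 1:
    # Replicates the model across the listed GPUs on each forward pass.
    model = nn.DataParallel(model, device_ids=gpu_ids)
```

One detail worth knowing: when CUDA_VISIBLE_DEVICES=0,1,2 is set, PyTorch renumbers the visible devices starting from 0, so the values passed to --gpu-ids refer to those remapped indices, not the physical GPU numbers.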