
If one GPU can run, why not multiple GPUs?

Open xupine opened this issue 6 years ago • 3 comments

Training runs on one GPU, but not on multiple GPUs. I am using PyTorch 1.1 and have 4 GPUs. Why can it run on one GPU but not on more than one? I have tried setting the number of workers to 0, reducing the batch size, and other methods, but it still does not work. Can anybody help me? Thanks very much.

xupine avatar Jul 16 '19 09:07 xupine

@xupine did you solve it? I have the same problem: if I use multi-GPU, the code gets stuck and does not run, and the GPU utilization stays at 0.

hust-kevin avatar Jul 31 '19 13:07 hust-kevin

Not sure if this is related, but I encountered the same problem (able to train on one GPU but not on multiple), and in my case it turned out to be caused by a problematic CUDA installation.

EpiCabbage avatar Aug 28 '19 09:08 EpiCabbage

You should set the argument --gpu-ids 0,1,2 in train.py and make sure CUDA_VISIBLE_DEVICES=0,1,2 is set in train.sh to enable multi-GPU training.

minushuang avatar Jun 02 '20 02:06 minushuang
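
The snippet below is a minimal sketch of the setup that comment describes: restricting the visible devices and wrapping the model in `torch.nn.DataParallel` with the parsed `--gpu-ids`. The model here is a stand-in, not this repo's DeepLab trainer, and the flag/environment handling is illustrative rather than the repo's exact code.

```python
# Minimal multi-GPU setup sketch (assumptions: names and defaults are illustrative).
import argparse
import os

# Make GPUs 0-2 visible before any CUDA context is created; equivalent to
# running `CUDA_VISIBLE_DEVICES=0,1,2 python train.py --gpu-ids 0,1,2`.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0,1,2")

import torch
import torch.nn as nn

parser = argparse.ArgumentParser()
parser.add_argument("--gpu-ids", type=str, default="0,1,2",
                    help="comma-separated GPU ids to use, e.g. 0,1,2")
args = parser.parse_args()
gpu_ids = [int(i) for i in args.gpu_ids.split(",")]

model = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # stand-in for the DeepLab model

if torch.cuda.is_available():
    if len(gpu_ids) > 1:
        # Replicate the model across the listed devices for data-parallel training.
        model = nn.DataParallel(model, device_ids=gpu_ids)
    model = model.cuda()
```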