
Multiple GPUs occupied during training phase


Hi,

When I start training, I noticed that apart from the GPU I assigned in train.lua, all the other GPUs are also occupied by 200 to 300 MB of memory each. Say I have 8 GPU cards (index 0, 1, ..., 7) and train on GPU 7; GPUs 0-6 are then also running something that takes 200~300 MB of GPU memory. I'm not sure what this "something" is, but can I confine it to the GPU I choose? Other people may be using the other GPUs, and they might not have 300 MB to spare.

Thanks in advance!

vinjohn avatar Aug 02 '17 05:08 vinjohn

The current version should only support single GPU training.

junyanz avatar Aug 05 '17 03:08 junyanz

The usual way to prevent that is to prefix the command that calls the .lua training script with CUDA_VISIBLE_DEVICES=0 (i.e. CUDA_VISIBLE_DEVICES=[index of the GPU, starting from 0]). That way Torch will only see that particular GPU and won't spam the others.
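For concreteness, here is a minimal sketch of what such a launch might look like, assuming the env-var launch style from the repo's README; the DATA_ROOT, name, and which_direction values are placeholders. Note that CUDA_VISIBLE_DEVICES renumbers the visible devices, so the selected card shows up as device 0 inside the process:

```bash
# Expose only physical GPU 7 to the training process; inside the
# process it becomes CUDA device 0, which Torch addresses as gpu=1
# (Torch's option is 1-based). Dataset/name values are placeholders.
CUDA_VISIBLE_DEVICES=7 DATA_ROOT=./datasets/facades name=facades_train \
    which_direction=BtoA gpu=1 th train.lua
```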

Quasimondo avatar Aug 18 '17 14:08 Quasimondo