tch-rs

Unexpected Cuda(0) Usage

Open neveranever98 opened this issue 3 years ago • 1 comments

I am training a model on a machine with eight GPUs. I don't know how to use two or more GPUs at the same time to train a single model (that is a separate problem), so I selected cuda(3) in my code. However, after the bold line of code in the figure below, which calls the forward function, cuda(0) appears to be in use as well. I don't know why.

image

And this is the forward function. It never refers to cuda(0) explicitly.

image

In conclusion, I have two questions.

  1. Why is cuda(0) used during the net.forward() call?
  2. How can I use two GPUs at the same time to train one model faster?

neveranever98, Mar 15 '22 06:03

I'm not sure how the C++ library maps the GPU device ids to what appears in nvidia-smi, so I cannot really comment here. Regarding multi-GPU training, did you try using both cuda(1) and cuda(2) in the same code to see if two GPUs get used?
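On the device-numbering point, one documented behavior is that the CUDA runtime numbers devices relative to the `CUDA_VISIBLE_DEVICES` environment variable, so the index in cuda(i) is a position in that list and not necessarily the index nvidia-smi prints. The sketch below is plain Rust with no tch-rs calls. `physical_index` is a hypothetical helper, it understands only integer entries (not GPU UUIDs), and it assumes `CUDA_DEVICE_ORDER=PCI_BUS_ID` so that the physical numbering matches nvidia-smi. Whether this mapping explains the cuda(0) usage reported above is not established in this thread.

```rust
/// Hypothetical helper: map a logical CUDA index (the `i` in cuda(i)) to the
/// physical index listed in a CUDA_VISIBLE_DEVICES-style string.
/// Simplification: only integer entries are understood, and parsing stops at
/// the first entry that is not a non-negative integer.
fn physical_index(visible_devices: &str, logical: usize) -> Option<usize> {
    let mut visible = Vec::new();
    for token in visible_devices.split(',') {
        match token.trim().parse::<usize>() {
            Ok(id) => visible.push(id),
            Err(_) => break,
        }
    }
    visible.get(logical).copied()
}

fn main() {
    // With only physical GPU 3 exposed, it is addressed as logical device 0.
    println!("{:?}", physical_index("3", 0));
    // With GPUs 3 and 5 exposed, logical device 1 is physical GPU 5.
    println!("{:?}", physical_index("3,5", 1));
    // Logical device 3 does not exist when only two GPUs are visible.
    println!("{:?}", physical_index("3,5", 3));
}
```

Launching the program with `CUDA_VISIBLE_DEVICES=3` and using cuda(0) in the code is a common way to keep a process from touching any other GPU, which may be worth trying here.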

LaurentMazare, May 14 '23 11:05
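For the second question, the usual approach to training one model on two GPUs is data parallelism: each device holds a copy of the weights, computes gradients on its share of the batch, and the partial gradients are combined before the update. The sketch below illustrates only that arithmetic, in plain Rust on the CPU with threads standing in for devices. It does not use tch-rs, and the one-parameter model, the function names, and the data are all hypothetical.

```rust
use std::thread;

/// Sum of per-sample gradients of the squared error (w*x - y)^2 with respect to w.
fn shard_gradient(w: f64, shard: &[(f64, f64)]) -> f64 {
    shard.iter().map(|&(x, y)| 2.0 * (w * x - y) * x).sum()
}

/// One data-parallel gradient: each worker handles one shard, then the
/// partial sums are added and divided by the total batch size.
fn data_parallel_gradient(w: f64, batch: &[(f64, f64)], workers: usize) -> f64 {
    assert!(workers > 0 && !batch.is_empty(), "need at least one worker and one sample");
    let chunk = (batch.len() + workers - 1) / workers; // ceiling division
    let handles: Vec<_> = batch
        .chunks(chunk)
        .map(|shard| {
            let shard = shard.to_vec(); // each worker owns a copy of its shard
            thread::spawn(move || shard_gradient(w, &shard))
        })
        .collect();
    let total: f64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    total / batch.len() as f64
}

fn main() {
    let batch = vec![(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)];
    let w = 1.5;
    // Mean gradient computed in one place versus split across two workers.
    let single = shard_gradient(w, &batch) / batch.len() as f64;
    let parallel = data_parallel_gradient(w, &batch, 2);
    println!("single: {}, two workers: {}", single, parallel);
}
```

In real training the shards would live on different devices and the combination step is typically an all-reduce. How to wire that up in tch-rs is not covered in this thread.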