tch-rs
Unexpected Cuda(0) Usage
I was training a model on a machine with eight GPUs. I don't know how to use two or more GPUs at the same time to train a single model (that is a separate problem), so I selected cuda(3) in my code. However, after the bold line of code in the figure below, which calls the forward function, cuda(0) appeared to be in use. I don't know why.
This is the forward function. It does not explicitly reference cuda(0).
In summary, I have two questions:
- Why is cuda(0) used inside the net.forward() function?
- How can I use two GPUs to train one model at the same time to speed up training?
I'm not sure how the C++ library maps the GPU device ids to what appears in nvidia-smi, so I cannot really comment on that. Regarding multi-GPU training, did you try using both cuda(1) and cuda(2) in the same code to see whether two GPUs get used?
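On the device id question, one thing that may help narrow it down: CUDA's own device numbering is not guaranteed to match the order nvidia-smi prints unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is set, and `CUDA_VISIBLE_DEVICES` can hide all but one GPU from a process. Below is a minimal sketch, using only the Rust standard library, of a launcher that starts the training binary with those two variables set. The helper name `pinned_command` and the `./train` path are hypothetical and not part of tch-rs. This does not explain why cuda(0) is touched; it only prevents the process from seeing it.

```rust
use std::process::Command;

/// Build a command that launches `program` with a single physical GPU visible.
/// `program` and `gpu_index` are hypothetical placeholders for illustration.
fn pinned_command(program: &str, gpu_index: usize) -> Command {
    let mut cmd = Command::new(program);
    // Ask CUDA to enumerate GPUs in PCI bus order, the order nvidia-smi uses.
    cmd.env("CUDA_DEVICE_ORDER", "PCI_BUS_ID");
    // Hide every GPU except the chosen one from the child process.
    cmd.env("CUDA_VISIBLE_DEVICES", gpu_index.to_string());
    cmd
}
```

A launcher would call something like `pinned_command("./train", 3).status()`. Note that inside the child process the only visible GPU is renumbered to index 0, so the training code would then select cuda(0) rather than cuda(3).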