torchquad
Let user choose which GPU to use
Feature
Desired Behavior / Functionality
Currently, one can only enable or disable CUDA, and only globally, using the `torchquad.set_up_backend` function. First of all, this means that even on multi-GPU machines one can only ever use the first device, `"cuda:0"`. Secondly, it means that using torchquad can break existing code it is integrated into, because `set_up_backend` globally changes how torch tensors are initialized. Instead, I propose adding `device` as an optional argument to the `integrate` function.
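For concreteness, a call could look something like this (a hypothetical sketch; the `device` keyword does not currently exist in torchquad):

```python
import torch
from torchquad import Simpson

simp = Simpson()
# Hypothetical: `device` would select where the integration grid and
# intermediate tensors live, without touching torch's global defaults.
result = simp.integrate(
    lambda x: torch.sin(x[:, 0]),
    dim=1,
    N=1001,
    integration_domain=[[0.0, 1.0]],
    device="cuda:1",  # proposed optional argument, not implemented today
)
```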
What Needs to Be Done
Unfortunately, I am not familiar enough with the library's code to make informed comments on how this can be implemented. I suspect that it's actually a fairly difficult request.
Hi Alex!
While this may theoretically be feasible, there is an easier way.
In your shell session / env, just set the `CUDA_VISIBLE_DEVICES` environment variable. E.g., to use device 2 (on Linux, but I think it's similar on Windows):

```bash
export CUDA_VISIBLE_DEVICES=2
```

For more info see [here](https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/).
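If it is more convenient to do this from Python, setting the variable before torch initializes CUDA should also work (a sketch; the key point is that the variable must be set before the first CUDA call):

```python
import os

# Must be set before torch initializes CUDA; safest is before
# importing torch at all.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

import torch
print(torch.cuda.device_count())  # 1: only physical GPU 2 is visible
```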
Of course, this still only works for single-GPU applications; multi-GPU would indeed be more complicated.
Hope this helps? :)
Thanks for the quick suggestion. Unfortunately, this still doesn't solve the issue, because of the other point I mentioned: all tensors would then be instantiated on the chosen device by default, rather than on the CPU. For any individual tensor instantiation this is easy to fix, but simply plugging torchquad into a large existing project can therefore break a lot of things (which happens to be the case for me), unless one goes through the code and makes every CPU instantiation explicit.
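To illustrate the breakage concretely (a minimal sketch, assuming `set_up_backend` switches torch's default tensor type to a CUDA type, as described above):

```python
import torch
import torchquad

print(torch.ones(3).device)  # cpu, the usual torch default

torchquad.set_up_backend("torch", data_type="float32")

# Unrelated tensor constructors elsewhere in the project now
# allocate on the GPU by default.
print(torch.ones(3).device)  # cuda:0
```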
Hmmm, yes, I see what you mean. Then the problem is really torchquad's behavior of setting the default device inside torch, I guess? One thing you could try, though I have not tested it, is to never call `torchquad.set_up_backend` at all; that should avoid changing the default behavior in torch. :thinking:
Yes, I had indeed tried this earlier (i.e., running your minimal example with that line commented out), but as best I can tell the entire computation then runs on the CPU (judging by the fact that `integral_value` is stored on the CPU), at which point I might as well use a scipy integrator.
Anyway, I have now hacked together an integrator that is sufficient for my purposes, but I imagine the suggested feature would still be useful if it could be implemented.
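For anyone hitting the same problem, here is a rough sketch of one possible workaround (untested; it assumes the torch backend, and that restoring torch's CPU default tensor type afterwards is safe for the rest of the program):

```python
import torch
from torchquad import MonteCarlo, set_up_backend

def integrate_on_gpu(fn, dim, N, integration_domain):
    # Sketch of a helper that confines torchquad's global side effects:
    # switch torch's defaults to CUDA only for the duration of the call.
    set_up_backend("torch", data_type="float32")
    try:
        result = MonteCarlo().integrate(
            fn, dim=dim, N=N, integration_domain=integration_domain
        )
    finally:
        # Restore the usual CPU default so the rest of the program
        # keeps instantiating tensors on the CPU.
        torch.set_default_tensor_type(torch.FloatTensor)
    return result.cpu()
```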