cleanrl
Improve compatibility with CUDA-enabled pytorch on non-CUDA devices
Problem Description
The CUDA-enabled PyTorch libraries are more capable than the CPU-only ones, and they can also run on the CPU if no CUDA device is available. However, due to a race condition, the current code base calls into the CUDA driver even if one passes --no-cuda as an argument.
The issue is this line of code: https://github.com/vwxyzjn/cleanrl/blob/8cbca61360ef98660f149e3d76762350ce613323/cleanrl/dqn.py#L147
It should first check whether the flag is set and only then call torch.cuda.is_available(). That way, the program runs fine in those scenarios.
Possible Solution
device = torch.device("cuda" if args.cuda and torch.cuda.is_available() else "cpu")
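To illustrate why this ordering matters, here is a minimal, self-contained sketch. The argparse wiring and the make_device_string/probe names are illustrative stand-ins (not cleanrl's actual CLI or code); the probe plays the role of torch.cuda.is_available so the example runs without PyTorch installed:

```python
import argparse

def make_device_string(cuda_flag, probe):
    # probe stands in for torch.cuda.is_available; because `and`
    # short-circuits, probe() is only called when cuda_flag is True.
    return "cuda" if cuda_flag and probe() else "cpu"

parser = argparse.ArgumentParser()
parser.add_argument("--cuda", dest="cuda", action="store_true", default=True)
parser.add_argument("--no-cuda", dest="cuda", action="store_false")

args = parser.parse_args(["--no-cuda"])
# With --no-cuda, the probe is never invoked, so no CUDA driver call happens.
print(make_device_string(args.cuda, lambda: True))  # -> cpu
```

With the flag checked first, a probe that would raise on a broken CUDA install is simply never reached when the user opts out.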
Could you clarify why this is a "race condition"? The device shouldn't be cuda unless both conditions are true, so I don't understand why the ordering would matter in this case.
Because Python's and is lazy: if the first operand is false, it doesn't evaluate the second one. With the current ordering, if I pass --no-cuda, Python still runs torch.cuda.is_available(), even though I specified that I do not want to use CUDA.
On devices that don't have CUDA drivers (or have the stub drivers) but have the CUDA version of PyTorch installed, this throws a runtime error. However, using the CPU on such devices is valid, since the PyTorch library can still function by using the CPU.
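The short-circuit behavior described above can be demonstrated without PyTorch at all. In this sketch, fake_is_available is a hypothetical stand-in for torch.cuda.is_available (which, on a machine with broken or stub CUDA drivers, is the call that fails):

```python
calls = []

def fake_is_available():
    # Stand-in for torch.cuda.is_available(); on an affected machine
    # the real call can raise, so it must not run when CUDA is disabled.
    calls.append("probe")
    return True

use_cuda = False  # as if --no-cuda was passed

# Problematic order: `and` evaluates left to right, so the probe
# runs even though use_cuda is False.
_ = fake_is_available() and use_cuda
assert calls == ["probe"]

calls.clear()

# Fixed order: `and` short-circuits on the False flag, and the
# probe is never evaluated.
_ = use_cuda and fake_is_available()
assert calls == []
```

This is exactly why putting args.cuda first in the proposed fix avoids ever touching the CUDA driver when the user opts out.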
Ok, that makes sense. I thought PyTorch would be smart enough not to raise a runtime error for this function. I'll make a PR with your suggested change.