Torch: Investigate if `set_default_device/with_device` is a deal-breaker
Description
The easiest way to give PyTensor users global (not fine-grained) control over CPU/GPU placement in the PyTorch backend would be `torch.set_default_device` / the `torch.device` context manager. However, this may be too slow, according to: https://github.com/pytorch/pytorch/issues/92701
We should benchmark to see if it is a problem. If yes, we may want to use a PyTensor config flag to get the same control without the PyTorch overhead.
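As a rough sketch of the config-flag alternative: read the device once from a global setting and pass it explicitly at every allocation, so no per-op default-device dispatch is involved. The flag name `TORCH_DEVICE` below is hypothetical, not an existing PyTensor config option.

```python
import torch

# Hypothetical config flag: a PyTensor-level setting read once at
# function compile time (the name is an assumption for illustration).
TORCH_DEVICE = "cpu"  # would be "cuda" on a GPU machine

def zeros(shape):
    # Pass the device explicitly at each allocation instead of relying
    # on torch.set_default_device / the torch.device context manager.
    return torch.zeros(shape, device=TORCH_DEVICE)

a = zeros(3)
for _ in range(1000):
    a = a + 1
out = a.cpu().numpy()
print(out)
```

The explicit `device=` argument gives the same global control as the context manager, but the choice is baked in when the function is built rather than resolved on every op.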
It does seem to have more overhead than I'd like, but I don't think this sounds terrible for real functions.
This microbenchmark slows down from 8ms to 11ms on my machine:
```python
%%timeit
# Explicit device
a = torch.zeros(3, device="cuda")
for _ in range(1000):
    a = a + 1
out = a.cpu().numpy()
```

```python
%%timeit
# Using default device
with torch.device("cuda"):
    a = torch.zeros(3)
    for _ in range(1000):
        a = a + 1
    out = a.cpu().numpy()
```
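Outside a notebook, the same comparison can be reproduced with the standard-library `timeit` module; on CPU the two variants can be timed side by side (the absolute numbers will differ from the CUDA measurement above):

```python
import timeit

import torch

device = "cpu"  # set to "cuda" on a GPU machine to match the numbers above

def explicit_device():
    # Device passed explicitly at allocation time.
    a = torch.zeros(3, device=device)
    for _ in range(1000):
        a = a + 1
    return a.cpu().numpy()

def default_device_ctx():
    # Device resolved through the default-device context manager.
    with torch.device(device):
        a = torch.zeros(3)
        for _ in range(1000):
            a = a + 1
        return a.cpu().numpy()

t_explicit = timeit.timeit(explicit_device, number=10)
t_default = timeit.timeit(default_device_ctx, number=10)
print(f"explicit: {t_explicit:.4f}s  default-device ctx: {t_default:.4f}s")
```

If the context-manager variant is consistently slower here as well, that supports routing the device choice through a PyTensor config flag instead.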