[Win] Does not show GPU profiling
Describe the bug
Scalene shows CPU profiling but not GPU profiling on Windows 11.
To Reproduce
python 3.10.12
scalene 1.5.29
pytorch-lightning 1.8.1
torch 2.1.0.dev20230830+cu118
torchaudio 2.1.0.dev20230830+cu118
torchmetrics 0.11.4
torchvision 0.16.0.dev20230830+cu118
scalene xx.py
scalene --profile-all xx.py
scalene --cpu --gpu xx.py
Desktop (please complete the following information):
- OS: Windows 11 Pro Insider Preview Build 23536
- GPU: RTX 2070 Super
- Tried with the repository version (pip install git+https://github.com/plasma-umass/scalene)
Additional context
NOTE: The GPU is currently running in a mode that can reduce Scalene's accuracy when reporting GPU utilization.
Run once as Administrator or root (i.e., prefixed with sudo) to enable per-process GPU accounting.
Please post the output of nvidia-smi. Try running nvidia-smi while your program is executing; you should see GPU utilization above 0 in its output. If not, that would mean your program is not actually exercising the GPU. Also, please run pip freeze and report the version info for pynvml. Thanks.
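For a quick check outside of Scalene, here is a minimal sketch (assuming pynvml is installed and the RTX 2070 is device index 0) that reports whether per-process accounting is enabled and whether the GPU shows any load; run it in a second terminal while your training script is executing:

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumption: the RTX 2070 is device 0

# Per-process accounting is what lets tools attribute GPU time to a specific process.
mode = pynvml.nvmlDeviceGetAccountingMode(handle)
print("per-process accounting enabled:", mode == pynvml.NVML_FEATURE_ENABLED)

# Overall device utilization, sampled over NVML's internal window.
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
print("GPU utilization (%):", util.gpu)

pynvml.nvmlShutdown()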
@emeryberger
pynvml==11.4.1
Nothing is jumping out at me - if you can send us a minimal working example, that would help considerably. Thanks.
@emeryberger
import torch
import torch.nn as nn
import torch.optim as optim

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# Define a sample neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(1000, 1000)

    def forward(self, x):
        x = self.fc(x)
        return x

# Create a large model
model = Net().to(device)

# Generate a large random input tensor
input_data = torch.randn(10000, 1000).to(device)

# Move the model and input data to the GPU
model = model.to(device)
input_data = input_data.to(device)

# Perform a large number of forward and backward passes
optimizer = optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
for _ in range(1000):
    optimizer.zero_grad()
    output = model(input_data)
    loss = criterion(output, input_data)
    loss.backward()
    optimizer.step()
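If it helps, here is a small, hedged addition to the end of the script (it assumes pynvml is importable, which torch.cuda.utilization() requires) to confirm the GPU is actually being exercised:

# Optional sanity check: report GPU activity from inside the script.
# memory_allocated() only needs CUDA; utilization() additionally needs pynvml.
if device.type == "cuda":
    torch.cuda.synchronize()
    print("allocated bytes on GPU:", torch.cuda.memory_allocated(device))
    print("GPU utilization (%):", torch.cuda.utilization(device))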
Thanks for the report! I tracked down the issue (I inadvertently introduced a UI issue a few days ago while resolving another one). I'll be pushing a new release out momentarily.
@emeryberger Thank you for the update.
However, it seems the problem still persists. I have tried running the code above with just scalene code.py on 2 machines, and no GPU profiling appears.
What version of Scalene? (scalene --version) Also, please share screenshots (and attach JSON if possible).
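If it is easier, something like the following should capture the version and a JSON profile to attach (profile.json is just an example filename; I believe these flags are available in recent Scalene releases):

scalene --version
scalene --json --outfile profile.json xx.py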
I'm also seeing this, and checked nvidia-smi in a separate panel to confirm the GPU is being used. The GPU column in Scalene is empty.
Same here. I am on a V100 GPU.
Is there a solution for this yet? @emeryberger