[Win] Does not show GPU profiling
Describe the bug
Scalene shows CPU profiling but not GPU profiling on Windows 11.
To Reproduce
python 3.10.12
scalene 1.5.29
pytorch-lightning 1.8.1
torch 2.1.0.dev20230830+cu118
torchaudio 2.1.0.dev20230830+cu118
torchmetrics 0.11.4
torchvision 0.16.0.dev20230830+cu118
scalene xx.py
scalene --profile-all xx.py
scalene --cpu --gpu xx.py
Desktop (please complete the following information):
- OS: Windows 11 Pro Insider Preview Build 23536
- GPU: RTX 2070 Super
- Tried with the repository version (pip install git+https://github.com/plasma-umass/scalene)
Additional context
NOTE: The GPU is currently running in a mode that can reduce Scalene's accuracy when reporting GPU utilization.
Run once as Administrator or root (i.e., prefixed with sudo) to enable per-process GPU accounting.
Please post the output of nvidia-smi. Try running nvidia-smi while your program is executing; you should see GPU utilization above 0 in its output. If not, that would mean your program is not actually exercising the GPU. Also, please run pip freeze and report the version info for pynvml. Thanks.
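For a quick check outside of Scalene, here is a minimal sketch (assuming pynvml is installed and the RTX 2070 is device index 0) that reports whether per-process accounting is enabled and whether the GPU shows any load; run it in a second terminal while your training script is executing:

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumption: the RTX 2070 is device 0

# Per-process accounting is what lets tools attribute GPU time to a specific process.
mode = pynvml.nvmlDeviceGetAccountingMode(handle)
print("per-process accounting enabled:", mode == pynvml.NVML_FEATURE_ENABLED)

# Overall device utilization, sampled over NVML's internal window.
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
print("GPU utilization (%):", util.gpu)

pynvml.nvmlShutdown()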
@emeryberger
pynvml==11.4.1
Nothing is jumping out at me - if you can send us a minimal working example, that would help considerably. Thanks.
@emeryberger
import torch
import torch.nn as nn
import torch.optim as optim

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# Define a sample neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(1000, 1000)

    def forward(self, x):
        x = self.fc(x)
        return x

# Create a large model
model = Net().to(device)

# Generate a large random input tensor
input_data = torch.randn(10000, 1000).to(device)

# Move the model and input data to the GPU
model = model.to(device)
input_data = input_data.to(device)

# Perform a large number of forward and backward passes
optimizer = optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
for _ in range(1000):
    optimizer.zero_grad()
    output = model(input_data)
    loss = criterion(output, input_data)
    loss.backward()
    optimizer.step()
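If it helps, here is a small, hedged addition to the end of the script (it assumes pynvml is importable, which torch.cuda.utilization() requires) to confirm the GPU is actually being exercised:

# Optional sanity check: report GPU activity from inside the script.
# memory_allocated() only needs CUDA; utilization() additionally needs pynvml.
if device.type == "cuda":
    torch.cuda.synchronize()
    print("allocated bytes on GPU:", torch.cuda.memory_allocated(device))
    print("GPU utilization (%):", torch.cuda.utilization(device))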
Thanks for the report! I tracked down the issue (I inadvertently introduced a UI issue a few days ago while resolving another one). I'll be pushing a new release out momentarily.
@emeryberger Thank you for the update.
However, it seems the problem still persists. I have tried running the code above with just scalene code.py on 2 machines, and no GPU profiling appears.
What version of Scalene? (scalene --version) Also, please share screenshots (and attach JSON if possible).
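If it is easier, something like the following should capture the version and a JSON profile to attach (profile.json is just an example filename; I believe these flags are available in recent Scalene releases):

scalene --version
scalene --json --outfile profile.json xx.py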
I'm also seeing this, and checked nvidia-smi in a separate panel to confirm the GPU is being used. The GPU column in Scalene is empty.
Same here. I am on a V100 GPU.
Is there a solution for this yet? @emeryberger