Nichols A. Romero

24 comments by Nichols A. Romero

@fxmarty-amd Is there a docker image I can use as a starting point to reproduce this issue?

@fxmarty-amd I took a look at this yesterday with @jeffdaily. We believe that part of the problem is the way nightly wheels are packaged. They contain their own version of...

@Amund Can you please clarify where you are downloading your PyTorch installation from? Thanks.

> [@naromero77amd](https://github.com/naromero77amd) Sure, I am using a Docker image based on rocm/pytorch:latest (seems to be rocm/pytorch:rocm7.0.2_ubuntu24.04_py3.12_pytorch_release_2.8.0)

Are you using the PyTorch version that comes with that image or are you...

@Amund From *outside* the container (bare metal), can you send us the output of:
```
rocm-smi --showpids
```
From *inside* the container, can you send the output of: ``` rocm-smi...

@Amund Can you do `which rocm-smi` outside the container? Can you please confirm it is installed?
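The same check can be scripted when it has to run in several places (host and container). The following is a minimal sketch using only the Python standard library; the helper name `find_tool` is hypothetical and not part of ROCm or PyTorch.

```python
import shutil


def find_tool(name="rocm-smi"):
    """Return the absolute path of `name` if it is on PATH, else None.

    Equivalent in spirit to running `which <name>` in a shell.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("rocm-smi not found on PATH")
    else:
        print(f"rocm-smi found at {path}")
```

Running it once on the host and once inside the container shows whether the tool is visible in each environment.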

@Amund You must have the GPU driver (kernel modules) installed and loaded on the host, otherwise the container will not work properly. If you believe that you have the drivers...
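One way to check on the host whether the driver module is loaded is to look it up in `/proc/modules`. The sketch below assumes a Linux host and assumes the relevant kernel module is named `amdgpu`; the helper `module_loaded` is a hypothetical name introduced for illustration.

```python
from pathlib import Path


def module_loaded(name, modules_text):
    """Return True if `name` is listed as a module in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, so only the
    first whitespace-separated field is compared (exact match).
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        # Assumption: the GPU driver module is called "amdgpu".
        loaded = module_loaded("amdgpu", proc_modules.read_text())
        print("amdgpu loaded" if loaded else "amdgpu not loaded")
    else:
        print("/proc/modules not available (not a Linux host?)")
```

This should be run on the bare-metal host, since the container shares the host kernel and does not load modules itself.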

Are you able to run any simple HIP program to completion? (It should be straightforward to ask ChatGPT to write one for you.) Ideally, it should run on the host and...

Can you try running your simple PyTorch example with this environment variable set: `PYTORCH_NO_HIP_MEMORY_CACHING=1 python simple.py` and post the output here?
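If it is more convenient to launch the script from Python than from a shell, the same variable can be set on a child process. This is a minimal sketch; the helper names `env_without_caching` and `run_script` are hypothetical, and `simple.py` is the script name taken from the command above.

```python
import os
import subprocess
import sys


def env_without_caching(base_env=None):
    """Return a copy of the environment with PYTORCH_NO_HIP_MEMORY_CACHING=1."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTORCH_NO_HIP_MEMORY_CACHING"] = "1"
    return env


def run_script(script):
    """Run `script` with the current interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, script],
        env=env_without_caching(),
        capture_output=True,
        text=True,
    )
```

Calling `run_script("simple.py")` returns a `CompletedProcess` whose `stdout`, `stderr`, and `returncode` can be pasted into the issue.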

Does this program run for you in the docker image?
```
import torch

# Use GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

# Create a tensor...
```