"RuntimeError: HIP error: invalid device function" when running "mnist" on 7900XTX
Context
- PyTorch version: 2.6.0+rocm6.2.4
- Operating System and version: Ubuntu 24.04.2 LTS x86_64
Your Environment
- Installed using source? [yes/no]: no
- Are you planning to deploy it using docker container? [yes/no]: no
- Is it a CPU or GPU environment?: GPU
- Which example are you using: mnist
- Link to code or data to repro [if any]: mnist
Expected Behavior
Train Epoch: 1 [0/60000 (0%)]	Loss: 2.326473
Train Epoch: 1 [640/60000 (1%)]	Loss: 1.377825
Train Epoch: 1 [1280/60000 (2%)]	Loss: 0.828890
Train Epoch: 1 [1920/60000 (3%)]	Loss: 0.623807
Train Epoch: 1 [2560/60000 (4%)]	Loss: 0.447925
Train Epoch: 1 [3200/60000 (5%)]	Loss: 0.293224
Train Epoch: 1 [3840/60000 (6%)]	Loss: 0.163648
Train Epoch: 1 [4480/60000 (7%)]	Loss: 0.633399
Train Epoch: 1 [5120/60000 (9%)]	Loss: 0.226126
Train Epoch: 1 [5760/60000 (10%)]	Loss: 0.226796
...
Current Behavior
Traceback (most recent call last):
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 147, in <module>
    main()
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 138, in main
    train(args, model, device, train_loader, optimizer, epoch)
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 45, in train
    output = model(data)
             ^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/examples/mnist/main.py", line 25, in forward
    x = self.conv1(x)
        ^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/USER/Desktop/PYTHON Document/PhyRevE/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 549, in _conv_forward
    return F.conv2d(
           ^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
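"invalid device function" usually means the runtime found no compiled kernel for the GPU architecture it is running on. A minimal sketch for checking that, assuming a ROCm build of PyTorch: it compares the architecture each visible GPU reports with the list the installed wheel was built for. The helper names (base_arch, arch_is_built, report) are hypothetical, and the gcnArchName property is assumed to be present on ROCm builds, so the code falls back to "unknown" if it is not.

```python
def base_arch(name):
    """Strip feature flags, e.g. 'gfx90a:sramecc+:xnack-' -> 'gfx90a'."""
    return name.split(":", 1)[0].strip()


def arch_is_built(device_arch, built_archs):
    """True if the device's base architecture is among the built targets."""
    return base_arch(device_arch) in {base_arch(a) for a in built_archs}


def report():
    """Print the comparison for every visible GPU (needs a ROCm PyTorch build)."""
    import torch  # imported lazily so the helpers above work without torch

    built = torch.cuda.get_arch_list()
    print("HIP runtime:", torch.version.hip)
    print("architectures in this build:", built)
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        # Assumption: gcnArchName exists on ROCm builds; fall back if it does not.
        arch = getattr(props, "gcnArchName", "unknown")
        status = "built" if arch_is_built(arch, built) else "NOT built"
        print(i, props.name, arch, status)
```

Calling report() in the same virtual environment lists each device index next to its architecture, which also shows which GPU a given HIP_VISIBLE_DEVICES index refers to.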
Possible Solution
export HIP_VISIBLE_DEVICES=1
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_ROCM_ARCH="gfx1100"
I tried these settings, but the error persists.
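Exported variables only reach processes started from the same shell, so one way to rule out a lost environment is to pass the variables to the child process explicitly. The sketch below does that with the values from this report plus AMD_SERIALIZE_KERNEL=3 from the error message. The names OVERRIDES, build_env, and run_example are hypothetical, and the script path is assumed to match the reproduction steps.

```python
import os
import subprocess
import sys

# Values copied from this report; AMD_SERIALIZE_KERNEL comes from the error text.
OVERRIDES = {
    "HIP_VISIBLE_DEVICES": "1",
    "HSA_OVERRIDE_GFX_VERSION": "11.0.0",
    "PYTORCH_ROCM_ARCH": "gfx1100",
    "AMD_SERIALIZE_KERNEL": "3",
}


def build_env(overrides=OVERRIDES, base=None):
    """Return a copy of the environment with the overrides applied on top."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def run_example(script="mnist/main.py", overrides=OVERRIDES):
    """Run the script in a child process and return its exit code."""
    result = subprocess.run([sys.executable, script], env=build_env(overrides))
    return result.returncode
```

Running run_example() from the examples directory starts the script with the overrides in place. Passing a copy of OVERRIDES with HIP_VISIBLE_DEVICES set to "0" is a quick way to check whether index 1 points at a different GPU than intended.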
Steps to Reproduce
- Install the latest PyTorch: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4
- Clone the examples repository and cd into its directory.
- Run python3 mnist/main.py
Failure Logs [if any]
Output of AMD_LOG_LEVEL=3 python main.py
AMD_LOG.log