"RuntimeError: No HIP GPUs are available" on AMD 6700XT (Ubuntu 22.04.2)
Specifications:
OS: Ubuntu 22.04.2 LTS
Ryzen 7 3700X:
*-cpu
     description: CPU
     product: AMD Ryzen 7 3700X 8-Core Processor
     vendor: Advanced Micro Devices [AMD]
     physical id: 11
     bus info: cpu@0
     version: 23.113.0
     serial: Unknown
     slot: AM4
     size: 2794MHz
     capacity: 4426MHz
     width: 64 bits
     clock: 100MHz
AMD Radeon 6700XT:
*-display
     description: VGA compatible controller
     product: Navi 22 [Radeon RX 6700/6700 XT / 6800M]
     vendor: Advanced Micro Devices, Inc. [AMD/ATI]
     physical id: 0
     bus info: pci@0000:08:00.0
     logical name: /dev/fb0
     version: c1
     width: 64 bits
     clock: 33MHz
     capabilities: pm pciexpress msi vga_controller bus_master cap_list rom fb
     configuration: depth=32 driver=amdgpu latency=0 mode=1920x1080 resolution=2560,1080 visual=truecolor xres=1920 yres=1080
I have ROCm installed with sudo amdgpu-install --usecase=hiplibsdk,rocm, following AMD's instructions for Ubuntu 22.04.
ROCm System Management Interface - Concise Info
GPU  Temp (DieEdge)  AvgPwr  SCLK    MCLK   Fan  Perf  PwrCap  VRAM%  GPU%
0    48.0c           8.0W    500Mhz  96Mhz  0%   auto  203.0W  6%     0%
End of ROCm SMI Log
I installed ComfyUI following the Installation Guide for Linux.
Everything works fine until I prompt something. On prompt I get the following error:
got prompt
Global Step: 470000
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
ERROR STARTS HERE
Traceback (most recent call last):
File "/home/karl/ComfyUI/execution.py", line 185, in execute
recursive_execute(self.server, prompt, self.outputs, x, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 67, in recursive_execute
outputs[unique_id] = getattr(obj, obj.FUNCTION)(**input_data_all)
File "/home/karl/ComfyUI/nodes.py", line 290, in load_checkpoint
out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
File "/home/karl/ComfyUI/comfy/sd.py", line 970, in load_checkpoint_guess_config
vae = VAE()
File "/home/karl/ComfyUI/comfy/sd.py", line 513, in __init__
device = model_management.get_torch_device()
File "/home/karl/ComfyUI/comfy/model_management.py", line 250, in get_torch_device
return torch.cuda.current_device()
File "/home/karl/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 674, in current_device
_lazy_init()
File "/home/karl/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
torch._C._cuda_init()
RuntimeError: No HIP GPUs are available
I'm guessing this is an issue between PyTorch and ROCm. Is there anything I could try, or any known solutions?
Update: I think the problem likely stems from PyTorch not yet supporting ROCm 5. Even if that isn't the cause of this particular error, it should still be causing other errors.
Downgrading ROCm did not solve anything. I found the same issue reported for Automatic1111: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8828 Currently installing an older PyTorch version; maybe that will solve the issue. I got Auto1111 running once, but I forgot the mystery fix that made it work...
Older PyTorch version, new error. So this is definitely a PyTorch x ROCm issue. What a surprise...
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release
warnings.warn(
/usr/lib/python3/dist-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release
warnings.warn(
Global Step: 840000
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Traceback (most recent call last):
File "/home/karl/ComfyUI/execution.py", line 185, in execute
recursive_execute(self.server, prompt, self.outputs, x, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 58, in recursive_execute
recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed)
File "/home/karl/ComfyUI/execution.py", line 67, in recursive_execute
outputs[unique_id] = getattr(obj, obj.FUNCTION)(**input_data_all)
File "/home/karl/ComfyUI/nodes.py", line 290, in load_checkpoint
out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
File "/home/karl/ComfyUI/comfy/sd.py", line 970, in load_checkpoint_guess_config
vae = VAE()
File "/home/karl/ComfyUI/comfy/sd.py", line 513, in __init__
device = model_management.get_torch_device()
File "/home/karl/ComfyUI/comfy/model_management.py", line 250, in get_torch_device
return torch.cuda.current_device()
File "/home/karl/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 552, in current_device
_lazy_init()
File "/home/karl/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 229, in _lazy_init
torch._C._cuda_init()
RuntimeError: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice
/home/karl/.local/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
return torch._C._cuda_getDeviceCount() > 0
The pytorch ROCm builds are standalone, they don't require you to have ROCm actually installed. They only require you to have a compatible kernel.
Try launching comfyui with: HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py
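For context: the 6700 XT identifies itself as gfx1031, which the prebuilt ROCm wheels of PyTorch don't ship kernels for; HSA_OVERRIDE_GFX_VERSION=10.3.0 makes the runtime treat it as gfx1030 instead. A quick way to see what the card actually reports (assuming the ROCm tools are installed on the system):
rocminfo | grep -i gfx    # prints the gfx target of each agent, e.g. gfx1031 for Navi 22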
The pytorch ROCm builds are standalone, they don't require you to have ROCm actually installed. They only require you to have a compatible kernel.
Try launching comfyui with:
HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py
I've been doing that the entire time. Unless I've been doing it wrong by typing HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py into the terminal in the ComfyUI directory?
If you have ROCm installed uninstall it completely, it might be conflicting with the ROCm that comes bundled with the pytorch package.
If you have ROCm installed uninstall it completely, it might be conflicting with the ROCm that comes bundled with the pytorch package.
Yep, I just reinstalled the entire OS to get everything completely clean and maximize my chances. Big moment in a few minutes... Will it work?
Welp, one step forward, one step back...
New error!
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
I'm a bit lost now; this is an AMD system with a 6700XT. Do I still need an NVIDIA driver? Edit: The link just takes me to the NVIDIA drivers page, sadly no Radeon 6700XT drivers... xD
That error means you installed the wrong pytorch, what's your python version?
That error means you installed the wrong pytorch, what's your python version?
2.0.1+cu117. Guessing "cu" stands for CUDA, not ROCm? Did I install the NVIDIA version??
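One quick way to check which build is actually installed (a minimal sketch; the exact version strings will differ) is to print the version and the HIP component of the install:
python3 -c 'import torch; print(torch.__version__); print(torch.version.hip)'
A CUDA wheel prints something like 2.0.1+cu117 and None for the HIP version; a ROCm wheel prints something like 2.0.1+rocm5.4.2 plus a HIP version string.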
I opened a new Issue for this different error. Makes it easier for people with the same issue later. #653
So it must have been the wrong PyTorch; now I'm back to the original "RuntimeError: No HIP GPUs are available" error. Clean install, followed the instructions.
So I previously thought this issue might be down to ROCm (which, from my very limited understanding, seems to be pretending to be CUDA) not working with PyTorch properly, but I did some playing around with PyTorch in Python and that seemed to work, so now I'm just confused. I have no prior knowledge of any of this, especially PyTorch and anything ROCm, so maybe I am misunderstanding it.
This is what I did/what worked:
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import math
>>> x = torch.empty(3,4)
>>> print(x)
tensor([[ 6.7262e-44, 0.0000e+00, 6.7262e-44, 0.0000e+00],
[ 2.7758e+14, 7.0065e-45, 0.0000e+00, 0.0000e+00],
[-1.5173e+19, 4.5652e-41, 7.4849e+31, 4.5653e-41]])
>>> print(type(x))
<class 'torch.Tensor'>
>>> ones = torch.zeros(2, 2) + 1
>>> twos = torch.ones(2, 2) * 2
>>> threes = (torch.ones(2, 2) * 7 - 1) / 2
>>> fours = twos ** 2
>>> sqrt2s = twos ** 0.5
>>> print(ones)
tensor([[1., 1.],
[1., 1.]])
>>> print(twos)
tensor([[2., 2.],
[2., 2.]])
>>> print(threes)
tensor([[3., 3.],
[3., 3.]])
>>> print(fours)
tensor([[4., 4.],
[4., 4.]])
>>> print(sqrt2s)
tensor([[1.4142, 1.4142],
[1.4142, 1.4142]])
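Note that the tensors above all live on the CPU, so this session doesn't actually exercise the HIP backend. A minimal sketch of checking whether PyTorch can see the GPU at all (assuming the ROCm build of PyTorch and the same gfx override as above):
HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 -c 'import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no HIP device visible")'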
What I also realized: the instructions list HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py, but I've been running HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 main.py, because otherwise python isn't found... In my mind this shouldn't make a difference. Do I need python-is-python3??? But that wouldn't change anything as far as I know, right?
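As for python vs python3: the environment variable is passed the same way either way, so it shouldn't matter as long as both names resolve to the same interpreter (python-is-python3 only adds the python symlink). A quick way to check what, if anything, python points to:
command -v python python3
python3 --version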
ROCm should add support for gfx1031; until then, ROCm and PyTorch have to be compiled manually with the -DCMAKE_HIP_ARCHITECTURES="gfx1031" or -DAMDGPU_TARGETS="gfx1031" variables...
Well, it's just ROCm.
Comparing precompiled PyTorch run with HSA_OVERRIDE_GFX_VERSION=10.3.0 against PyTorch compiled with gfx1031 support (sic!), there is a difference.
P.S. http://reddit.com/r/AMD_Stock/comments/136duk0/upcoming_rocm_linux_gpu_os_support/
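For reference, a minimal sketch of how those variables would be passed when building a ROCm component from source (assuming a CMake-based build in a build/ directory; which of the two variables applies depends on the component):
cmake -S . -B build -DCMAKE_HIP_ARCHITECTURES="gfx1031" -DAMDGPU_TARGETS="gfx1031"
cmake --build build -j"$(nproc)"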
I've successfully used ComfyUI with RX 6700 on Ubuntu 22.10 (shouldn't differ too much from 22.04). I did install ROCm 5.4.3, but gave up on compiling PyTorch (it's a world of pain and you actually do need ROCm installed to compile it anyway).
PyTorch precompiled against ROCm 5.5 is not yet available, and the best you can get right now is 5.4.2 (it will work with ROCm 5.4.3 but not 5.5).
If you have a different version of ROCm installed already, you might want to uninstall it using:
sudo amdgpu-uninstall --rocmrelease=all
You should be able to install ROCm using those commands:
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/jammy/amdgpu-install_5.4.50403-1_all.deb
sudo apt-get install ./amdgpu-install_5.4.50403-1_all.deb
sudo amdgpu-install --usecase=rocm,hip,mllib --no-dkms
sudo usermod -a -G video,render $LOGNAME
The last line gives your user access to the GPU, which is supposedly needed by the ROCm version of PyTorch. You need to restart Ubuntu after this. You can replace hip with hiplibsdk if you want. You'll need hiplibsdk to compile stuff with ROCm (like PyTorch or the Ooba Booga LLaMA plugin), but it shouldn't be needed just for running ComfyUI.
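After the reboot, a quick sanity check that the group change and the driver stack took effect (using the ROCm tools from the install above) might be:
groups      # should now include video and render
rocm-smi    # the card should be listed, as in the output near the top of this issue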
In your ComfyUI folder, activate the venv, uninstall the existing PyTorch, and install PyTorch with ROCm enabled.
source venv/bin/activate
pip3 uninstall torch torchvision torchaudio
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
I'm only 90% sure about the command for activating the venv ;) haven't done it manually in like 2 weeks. You might want to use pip instead of pip3 (on my system there was no difference).
For running ComfyUI:
source venv/bin/activate
export HSA_OVERRIDE_GFX_VERSION=10.3.0
python main.py
You can save that to an sh script and run that instead. I did something like that, but I'm writing this on Windows and I don't have access to my Ubuntu at the moment to check.
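As an illustration only, such a script (file name and venv path assumed, matching the steps above) could look like this:
#!/usr/bin/env bash
# run_comfyui.sh - hypothetical wrapper around the three commands above
cd "$(dirname "$0")"                      # assumes the script lives in the ComfyUI folder
source venv/bin/activate                  # venv path as used in the steps above
export HSA_OVERRIDE_GFX_VERSION=10.3.0    # make the 6700 XT pretend to be gfx1030
python main.py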
P.S. I've wasted a lot of time trying to get SD to work on RX6700 on Ubuntu and then wasted even more time to get it to work on 7900 XTX.
If you have ROCm installed uninstall it completely, it might be conflicting with the ROCm that comes bundled with the pytorch package.
Emm, do I need to uninstall the AMD GPU driver? Does the AMD GPU driver contain ROCm? I am new to Ubuntu.
I've successfully used ComfyUI with RX 6700 on Ubuntu 22.10 (shouldn't differ too much from 22.04). [...]
If anyone comes here do this with just changing the version
I must be totally dense for not fully understanding what you mean by "If anyone comes here do this with just changing the version".
Very, very tired of trying to get everything working properly. Sorry for asking what you meant by that, but I'd rather ask than have to start fresh again for the 100th time...
Thanks in advance!
After following the instructions in the README, I encountered the same "No HIP GPUs are available" error.
Doing sudo usermod -a -G video,render $LOGNAME was enough to fix it for me.
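Worth noting: the group change only takes effect after logging out and back in (or rebooting). A quick way to verify:
groups $USER    # reads the group database; video and render should appear after the usermod
id -nG          # groups of the current session; needs a re-login to pick up the change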
I tried so many things, but in the end all I had to do was to run it with sudo -_-
sudo $(which python) main.py
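Needing sudo usually points at the same device-permission issue that the usermod command above addresses; a quick check (device names may vary) is:
ls -l /dev/kfd /dev/dri/renderD*    # your user needs access to these, normally via the render/video groups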