stable-diffusion-webui
[Bug]: RuntimeError: Torch is not able to use GPU
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
$ ./webui.sh
################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################
################################################################
Running on debian user
################################################################
################################################################
Repo already cloned, using it as install directory
################################################################
################################################################
Create and activate python venv
################################################################
################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc.so.4
Python 3.10.9 (main, Mar 1 2023, 18:23:06) [GCC 11.2.0]
Version: v1.3.2
Commit hash: baf6946e06249c5af9851c60171692c44ef633e0
Traceback (most recent call last):
File "/home/debian/project/stable-diffusion-webui/launch.py", line 38, in <module>
main()
File "/home/debian/project/stable-diffusion-webui/launch.py", line 29, in main
prepare_environment()
File "/home/debian/project/stable-diffusion-webui/modules/launch_utils.py", line 257, in prepare_environment
raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
$ nvidia-smi
Fri Jun 23 22:51:58 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf           Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        On  | 00000000:01:00.0 Off |                  Off |
|  0%   31C    P8              28W / 450W |     21MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2796      G   /usr/lib/xorg/Xorg                            8MiB |
|    0   N/A  N/A      2825      G   /usr/bin/gnome-shell                         10MiB |
+---------------------------------------------------------------------------------------+
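For reference, the failing check can be bypassed as the message suggests by setting COMMANDLINE_ARGS in webui-user.sh, though this only skips the test rather than making Torch use the GPU. A minimal sketch:

# in webui-user.sh
export COMMANDLINE_ARGS="--skip-torch-cuda-test"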
Steps to reproduce the problem
none
What should have happened?
none
Commit where the problem happens
none
What Python version are you running on ?
Python 3.10.x
What platforms do you use to access the UI ?
Linux
What device are you running WebUI on?
Nvidia GPUs (RTX 20 above)
What browsers do you use to access the UI ?
No response
Command Line Arguments
none
List of extensions
none
Console logs
none
Additional information
No response
I have run into the same issue; this is from my post in the Discussions section (not sure why it ended up there): I tried to run webui.sh on an Ubuntu 22.04 server and got:

Traceback (most recent call last):
  File "/home/user/stable-diffusion-webui/launch.py", line 38, in <module>
    main()
  File "/home/user/stable-diffusion-webui/launch.py", line 29, in main
    prepare_environment()
  File "/home/user/stable-diffusion-webui/modules/launch_utils.py", line 257, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
But I have a GPU (RTX 3060) and I think I have installed CUDA correctly (I have done the same in a WSL environment on the same PC and got the webui working), and oobabooga runs correctly on the GPU. I suspect it is because the PC has two GPUs: an iGPU (integrated with the AMD CPU) and the RTX 3060 (see the note after the lshw output below).
When I run sudo lshw -C display I get:

  *-display
       description: VGA compatible controller
       product: GA106 [GeForce RTX 3060]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=nvidia latency=0 mode=3840x2160 visual=truecolor xres=3840 yres=2160
       resources: iomemory:780-77f iomemory:7c0-7bf irq:86 memory:fb000000-fbffffff memory:7800000000-7bffffffff memory:7c00000000-7c01ffffff ioport:f000(size=128) memory:fc000000-fc07ffff
  *-display
       description: VGA compatible controller
       product: Cezanne
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:0d:00.0
       logical name: /dev/fb0
       version: c9
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi msix vga_controller bus_master cap_list fb
       configuration: depth=32 driver=amdgpu latency=0 resolution=3840,2160
       resources: iomemory:7c0-7bf iomemory:7c0-7bf irq:31 memory:7c10000000-7c1fffffff memory:7c20000000-7c201fffff ioport:e000(size=256) memory:fc500000-fc57ffff
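If the dual-GPU layout were the cause, one thing worth trying (a sketch, not a confirmed fix for this case) is pinning the process to the NVIDIA card with the standard CUDA_VISIBLE_DEVICES environment variable before launching:

# device index 0 is an assumption; confirm the RTX 3060's index with: nvidia-smi -L
CUDA_VISIBLE_DEVICES=0 ./webui.sh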
When I try to run:

import torch
import sys
print('__Python VERSION:', sys.version)
print('__pyTorch VERSION:', torch.__version__)
print('__CUDA VERSION')
from subprocess import call
print('__CUDNN VERSION:', torch.backends.cudnn.version())
print('__Number CUDA Devices:', torch.cuda.device_count())
print('__Devices')
call(["nvidia-smi", "--format=csv", "--query-gpu=index,name,driver_version,memory.total,memory.used,memory.free"])
print('Active CUDA Device: GPU', torch.cuda.current_device())
print('Available devices ', torch.cuda.device_count())
print('Current cuda device ', torch.cuda.current_device())
I get an error:

__Python VERSION: 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
__pyTorch VERSION: 2.0.1+rocm5.4.2
__CUDA VERSION
__CUDNN VERSION: 2019000
__Number CUDA Devices: 1
__Devices
index, name, driver_version, memory.total [MiB], memory.used [MiB], memory.free [MiB]
0, NVIDIA GeForce RTX 3060, 530.41.03, 12288 MiB, 1 MiB, 12043 MiB
Traceback (most recent call last):
  File "/home/user/data/test.py", line 12, in <module>
    print('Active CUDA Device: GPU', torch.cuda.current_device())
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 674, in current_device
    _lazy_init()
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No HIP GPUs are available
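The +rocm5.4.2 suffix in that version string is the real clue: this venv contains the ROCm (AMD) build of PyTorch, which cannot drive an NVIDIA card, hence "No HIP GPUs are available". A quick way to check which build is installed (a sketch, assuming the default venv location):

venv/bin/python -c "import torch; print(torch.__version__, torch.version.cuda, torch.version.hip)"
# a CUDA build prints a CUDA version and None for hip; a ROCm build prints None for cuda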
I think I have tried everything I could find, but the error persists. I have gotten webui.sh to run in a WSL environment on the same PC, so it shouldn't be a hardware issue.
I 'solved' the problem by copying my stable-diffusion folder from WSL2 to the dual-boot Ubuntu system. Not sure why, but it works.
I think I found the solution to this issue. Mount the necessary GPU-related files: make sure the appropriate NVIDIA driver files and libraries are mounted inside the container. My webui is set up in a Docker container, so I added the following volume mounts to my Docker Compose file:
services:
  stable-diffusion:
    ...
    volumes:
      ...
      - /usr/local/nvidia/lib64:/usr/local/nvidia/lib64
      - /usr/local/nvidia/bin:/usr/local/nvidia/bin
      ...

Adjust the source paths /usr/local/nvidia/lib64 and /usr/local/nvidia/bin based on the actual locations of the NVIDIA driver files on your host system.
If you don't know where they are, use this to find them:

find / -name "libnvidia-*.so" 2>/dev/null
find / -name "nvidia-smi" 2>/dev/null
I imagine this same solution can work for regular installs as well; it seems like PyTorch doesn't know the locations of some NVIDIA files required to use the GPU.
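For a regular (non-Docker) install, the rough equivalent would be putting the driver libraries on the loader path before launching. A sketch, where the directory is an assumption to be replaced with whatever the find commands above report:

# hypothetical location; substitute the directory the find commands found
export LD_LIBRARY_PATH=/usr/local/nvidia/lib64:$LD_LIBRARY_PATH
./webui.sh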
I use Ubuntu 22.04 with an NVIDIA GPU and had the same issue. In my case, I installed PyTorch 2.0.1 without rocm5.4.2 and it works.

You can change this line in the webui.sh file:

export TORCH_COMMAND="pip install torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2 --index-url https://download.pytorch.org/whl/rocm5.4.2"

to

export TORCH_COMMAND="pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118"
How do I edit this code from the terminal?
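One way, as a sketch (assuming nano or GNU sed is available):

# open the file, change the TORCH_COMMAND line, then Ctrl+O to save and Ctrl+X to exit
nano webui.sh

# or make the same change non-interactively:
sed -i 's|+rocm5.4.2||g; s|whl/rocm5.4.2|whl/cu118|' webui.sh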