stable-diffusion-webui
[Bug]: a1111 is broken on AMD cards after git pull to 1.7.0
Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
- [X] The issue exists in the current version of the webui
- [X] The issue has not been reported before recently
- [ ] The issue has been reported before but has not been fixed yet
What happened?
It seems like something broke or changed in the code after the 1.7.0 release. All I get is the error below, and I can't find a solution anywhere. I can only open the UI with --skip-torch-cuda-test, but this results in long generation times because it uses the CPU and not the GPU. Even a rollback to 1.6.0 doesn't work anymore and has the same problem.
ERROR: (see console logs below)
Steps to reproduce the problem
1. Go to the location of webui-user.bat
2. Launch it
What should have happened?
WebUI should start without problems.
What browsers do you use to access the UI ?
No response
Sysinfo
Console logs
venv "D:\AIS\AUTOMATIC1111\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Traceback (most recent call last):
  File "D:\AIS\AUTOMATIC1111\stable-diffusion-webui\launch.py", line 48, in <module>
    main()
  File "D:\AIS\AUTOMATIC1111\stable-diffusion-webui\launch.py", line 39, in main
    prepare_environment()
  File "D:\AIS\AUTOMATIC1111\stable-diffusion-webui\modules\launch_utils.py", line 384, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
Press any key to continue . . .
Additional information
I haven't done or installed anything besides using git pull to update the UI to 1.7.0.
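For anyone triaging this: the failing startup check just asks torch whether it can see a GPU, so you can reproduce it directly inside the webui's venv. A minimal sketch, assuming the default venv location (on Windows, activate with venv\Scripts\activate instead of the source line):

```sh
# Activate the webui's own virtual environment (Linux/macOS shown here;
# on Windows run venv\Scripts\activate instead).
source venv/bin/activate
# Ask the installed torch build whether it can actually see a GPU.
# A False here is the condition that makes launch.py raise
# "Torch is not able to use GPU".
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If this prints False, the installed torch build can't use your GPU, which is what the workarounds below (ROCm or DirectML builds) address.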
Same problem here. I was following this guide https://youtu.be/Po-ykkCLE6M?si=KHUQNKIR0rjik8bZ to install Stable Diffusion locally.
same problem
Also had this problem on both Ubuntu and Windows 11. Solved it by adding one line in webui.sh (before installing).
After this:
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
add this:
export TORCH_COMMAND="pip install torch==2.1.1+rocm5.6 torchvision==0.16.1+rocm5.6 --index-url https://download.pytorch.org/whl/rocm5.6"
Worked on Ubuntu; didn't test on Windows.
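In context, the top of webui.sh would then look roughly like this (the SCRIPT_DIR line is stock; only the export is new):

```sh
# Stock line from webui.sh that resolves the script's own directory:
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
# Added line: force the installer to fetch ROCm 5.6 builds of torch and
# torchvision instead of whatever the default TORCH_COMMAND would install.
export TORCH_COMMAND="pip install torch==2.1.1+rocm5.6 torchvision==0.16.1+rocm5.6 --index-url https://download.pytorch.org/whl/rocm5.6"
```

Since TORCH_COMMAND is only consulted when torch gets installed, this has to be in place before the first launch, or you need to delete the venv folder so torch is reinstalled.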
Windows doesn't have a package for 2.1.1+rocm5.6.
I'm still prodding for a Windows solution, but it does not seem that installing torch 2.1.1 and torchvision 0.16.1 alone will do the trick. I'll update this if I find a way to make it work.
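If you want to check for yourself what a given PyTorch index actually offers for your platform, recent pip versions have an (experimental) subcommand for it; a sketch, with the caveat that the output format may change between pip releases:

```sh
# List the torch versions the ROCm 5.6 index can serve to the current
# interpreter and platform; on Windows, no +rocm5.6 builds show up.
pip index versions torch --index-url https://download.pytorch.org/whl/rocm5.6
```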
Same issue for me; I can get it to work with --no-half and --skip-torch-cuda-test. That being said, what used to take 40 seconds now takes 9 minutes.
Yes, same here. It's because with --skip-torch-cuda-test it only uses the CPU, which is slow. I hope we find a way to make it work again.
Workaround: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/340#issuecomment-1869107163
It forces DirectML. Made my instance usable again. Not sure if it's a fix, but it gets me back to where I was.
Yeah, fixed for me too.
As --use-ipex was introduced upstream, I added --use-directml and made CUDA/ROCm the default. So if you want to skip the CUDA test and let torch use the DirectML device, simply add --use-directml to your command line arguments.
> Workaround: lshqqytiger#340 (comment)
So that link has nice instructions, which I skipped to the end of, AND IT WORKED!! Bottom line is all I did in webui-user.bat was add '--use-directml' to the command args. Everything else I left at my normal setup. Back to 30 seconds give or take, instead of 9 minutes. I am on a Ryzen 9 3900X and an RX 5700 XT, Windows. Good luck, hope it's this simple for everyone.
Did anyone find a solution for Linux?
Same issue. I was trying to get XL-Turbo working, and I put "git pull" before "call webui.bat" to update. RX 570 8GB on Windows 10.
Fix: in webui-user.bat:
set COMMANDLINE_ARGS= --lowvram --use-directml
Notes: I had already deleted the venv folder, so it reinstalled PyTorch the first time I ran webui-user.bat after adding --use-directml to the command line args.
Also, I removed all my other args and was still able to do text2image, albeit a bit slower than before: precision full, no half, no half vae, sub-quad attention, NaN check, upcast, etc. I'll find out how much of that stuff I still need. The Settings tab is easier to make sense of now, so hopefully we don't still need so many args just to get inpainting and ControlNet to function.
> Did anyone find a solution for Linux?
There's no problem with running SD webui with AMD on Linux, unless you got the wrong torch version.
Just go to your SD webui virtual environment and do:
pip uninstall torch torchvision torchaudio
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7
With this you'll get Torch 2.3.0 and can use it with the FP8 storage optimization in SD webui.
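After swapping torch like this, it's worth confirming the new build before starting the UI. A minimal check, run inside the same venv:

```sh
# ROCm builds of PyTorch expose the GPU through the regular torch.cuda API;
# torch.version.hip is set on ROCm builds (it is None on CUDA/CPU builds).
python -c "import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())"
```

If this prints a 2.3.0 nightly version, a HIP version, and True, the webui should start without --skip-torch-cuda-test.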
This might help others, but see my comment on another AMD-related issue on how to fix torch to use the DirectML version: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/14462#issuecomment-1872405104
> pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7
I already have it installed. If I run just ./webui.sh, I am getting this error:
stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_meta_registrations.py", line 4815, in zeros_like
res.fill_(0)
RuntimeError: HIP error: shared object initialization failed
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
I've tried playing with various parameters, but I've only managed to break it more. I have a 7900 XTX. What is your setup?
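Not a fix, but the traceback itself names the first debugging step: HIP reports kernel errors asynchronously, so rerunning with blocking launches usually makes the stack trace point at the real failing call. A sketch:

```sh
# Make HIP kernel launches synchronous so the reported stack trace matches
# the call that actually failed (this is the hint in the error text).
HIP_LAUNCH_BLOCKING=1 ./webui.sh
```

"shared object initialization failed" on RDNA cards is often a torch build lacking kernels for the card's gfx target, which is what the HSA_OVERRIDE_GFX_VERSION suggestion in the next comment works around.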
Create a txt file inside the SD webui folder with this:
#!/bin/sh
source venv/bin/activate
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export HIP_VISIBLE_DEVICES=0
export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.8,max_split_size_mb:512
python3 launch.py --enable-insecure-extension-access --opt-sdp-attention
and then save it as launch.sh. Open a terminal and do:
bash launch.sh
Since your GPU is RDNA 3, the correct GFX version is 11.0.0, as above.
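If you're not sure which override matches your card, the ROCm tooling can report the native target. A sketch, assuming rocminfo is installed along with the ROCm runtime:

```sh
# Print the gfx target the ROCm runtime sees. A 7900 XTX reports gfx1100,
# which maps to HSA_OVERRIDE_GFX_VERSION=11.0.0 (gfx1030 would map to
# 10.3.0, and so on).
rocminfo | grep gfx
```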
@zakusworo thank you, though it seems that I still have some issues. This is the result when I run it:
Python 3.10.13 (main, Dec 21 2023, 15:23:51) [GCC 13.2.1 20230801]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Launching Web UI with arguments: --enable-insecure-extension-access --opt-sdp-attention
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Style database not found: /home/mati/projects/stable-diffusion-webui/styles.csv
Loading weights [6ce0161689] from /home/mati/projects/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 13.1s (prepare environment: 5.9s, import torch: 3.1s, import gradio: 1.1s, setup paths: 0.9s, other imports: 0.7s, load scripts: 0.3s, create ui: 0.4s, gradio launch: 0.6s).
Opening in existing browser session.
Creating model from config: /home/mati/projects/stable-diffusion-webui/configs/v1-inference.yaml
Applying attention optimization: sdp... done.
Then after writing a prompt and clicking generate, nothing in the terminal changes. It only maxes out the graphics pipe and command processor plus one CPU core to 100%.
Any ideas?
Which version is your AMD GPU driver? Make sure it's the same version as your ROCm version.
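A quick way to compare the two sides of that; a sketch assuming a default /opt/rocm install and the webui venv being active:

```sh
# System-side ROCm release (the path may differ per distro/packaging):
cat /opt/rocm/.info/version
# HIP version the installed torch wheel was built against:
python -c "import torch; print(torch.version.hip)"
```

They don't need to match digit for digit, but a big gap (say, a 5.x runtime under a torch built for 6.0) is a common source of HIP initialization errors.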
Changing --skip-torch-cuda-test to --use-directml in COMMANDLINE_ARGS of webui-user.bat helped for me - https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/340#issuecomment-1869107163
Same issue; wasn't able to get the other workarounds working on Fedora. Stuck in CPU mode. Any idea when a fix will be merged?
> Which version is your AMD GPU driver? Make sure it's the same version as your ROCm version.
Wdym? I thought you can install them separately.
Anyway, I'm now using the 6.7.2-arch1-1 kernel with Mesa 23.3.4-arch1.2. I've tried it with both ROCm 5.7 and now 6.0 (some people reported having ROCm 6.0 working with a torch built against 5.7); same exact issue.
First of all, on Windows you should use lshqqytiger's fork: https://github.com/lshqqytiger/stable-diffusion-webui-directml The main version is only meant for Nvidia users, or for AMD users who are using ROCm on Linux.
> Workaround: lshqqytiger#340 (comment)
N.B.: that comment referred to lshqqytiger's fork. It's different from the main one.
> Anyway, I'm now using the 6.7.2-arch1-1 kernel with Mesa 23.3.4-arch1.2. I've tried it with both ROCm 5.7 and now 6.0; same exact issue.
Mesa isn't related at all, and all the kernel stuff related to amdgpu is... well, in the kernel.
That kernel version should be just fine (actually, it should be the first kernel version working, thanks to a recent fix for AMD cards).
I read in another thread that you had installed PyTorch using Arch Linux's package. It should work, I guess, but in my (working) setup I used a different approach: ROCm is installed using opencl-amd from the AUR, and PyTorch is installed in the venv through pip (the webui script should do that automatically).
You can also try to uncomment the TORCH_COMMAND line in webui-user.sh and change it to this:
export TORCH_COMMAND="pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7"
Guys, I don't know if it's the same problem, but my friend is having trouble installing on an AMD GPU.
I asked him to send me his configs, and it's a very weak computer; I don't know if that could be the problem with the installation. I have NVIDIA, so I had no problem.
He followed the installation steps from a standard guide right here on GitHub.
The problem below:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/159564467/58e129df-819f-45ea-a04b-a271e90fddb2
PC settings:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/assets/159564467/08cdf167-95ec-4ee4-88c1-7ad2915595bc
Do you think there is a solution? I hope so. Anyway, thanks!
The official Automatic1111 only works on Linux for AMD GPUs, as it needs the ROCm runtime, which is not yet fully available on Windows.
To use Automatic1111 on AMD + Windows right now, he needs the DirectML version: https://github.com/lshqqytiger/stable-diffusion-webui-directml
This is all written in the installation instructions.
That's right, bro. In the installation guide there was this link in the command; notice that in the CMD in his video there is the directory with DirectML. But it still gives this error. :/
I don't know how to help him; I already gave him some instructions that I found here, but nothing changed according to him.
90% of the errors are about reading a line of a file, both in the webui directory and in Python. The problem is that I don't have access to his computer to check those lines, if that's even the problem. Anyway... I don't know how to help him.
Oh, I'm sorry, I didn't realize you were indeed using that version.
Uhm... wait a sec, is that an integrated GPU from 2012? There's no way SD could possibly work on that; it's too old and underpowered.