stable-diffusion-webui
[Bug]: CUDA error since Stable Diffusion 2.0 changes
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
This is a repost of issue #5097, which was closed erroneously.
Ever since the first changes made to accommodate the new v2.0 models, I cannot generate an image in txt2img. I did a fresh clone on 2022-12-25 and the issue persists. I can start the web-ui and enter a prompt. After clicking Generate, the following occurs...
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:22<00:00, 1.11s/it]
Error completing request███████████████████████████████████████████████████████████████| 20/20 [00:18<00:00, 1.05it/s]
Arguments: ('photo of a llama', '', 'None', 'None', 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 0, 0, 0, False, False, False, False, '', 1, '', 0, '', True, False, False) {}
Traceback (most recent call last):
File "C:\AI\stable-diffusion-webui\modules\call_queue.py", line 45, in f
res = list(func(*args, **kwargs))
File "C:\AI\stable-diffusion-webui\modules\call_queue.py", line 28, in f
res = func(*args, **kwargs)
File "C:\AI\stable-diffusion-webui\modules\txt2img.py", line 49, in txt2img
processed = process_images(p)
File "C:\AI\stable-diffusion-webui\modules\processing.py", line 469, in process_images
res = process_images_inner(p)
File "C:\AI\stable-diffusion-webui\modules\processing.py", line 576, in process_images_inner
x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
File "C:\AI\stable-diffusion-webui\modules\processing.py", line 576, in <listcomp>
x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
File "C:\AI\stable-diffusion-webui\modules\processing.py", line 404, in decode_first_stage
x = model.decode_first_stage(x)
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\AI\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
return self.first_stage_model.decode(z)
File "C:\AI\stable-diffusion-webui\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
return first_stage_model_decode(z)
File "C:\AI\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
dec = self.decoder(z)
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 631, in forward
h = self.mid.attn_1(h)
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\AI\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 258, in forward
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\xformers\ops.py", line 862, in memory_efficient_attention
return op.forward_no_grad(
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\xformers\ops.py", line 305, in forward_no_grad
return cls.FORWARD_OPERATOR(
File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\_ops.py", line 143, in __call__
return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
This is my webui-user.bat
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --medvram
rem git pull
call webui.bat
I do have an ancient video card (GTX 970), but I am able to use the web-ui if I roll back with git reset --hard 828438b4a190759807f9054932cae3a8b880ddf1 (more than a month stale now). However, there are many new features, and I'm running into compatibility issues with models and extensions. Is there any hope this will be addressed?
Steps to reproduce the problem
- Go to txt2img
- Type prompt
- Click Generate
What should have happened?
No CUDA error
Commit where the problem happens
c6f347b81f584b6c0d44af7a209983284dbb52d2
What platforms do you use to access UI ?
Windows
What browsers do you use to access the UI ?
Google Chrome
Command Line Arguments
No response
Additional information, context and logs
No response
Yes, you cannot run SD2.0 with a GPU that old (I saw you using M40 and Kepler-series cards; those are not even supported by the latest PyTorch anymore).
I'm not trying to use a v2.0 model, just the old 1.5 I use every day on the old commit.
Yes because the SD2.0 update switched to the SD2.0 version of the SD repo, which uses some unsupported operators
So... that's it? No path forward? Not good.
The problem is I cannot reproduce your issue, and I believe not many others can, so it is more likely related to your own system than to this repo, given your old GPU. Until you upgrade to a newer GPU (Pascal or later), you will have to stay on the old commit.
As you can see yourself in the traceback, repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py, the error is not from this repo's code but from https://github.com/Stability-AI/stablediffusion. If you really want a fix, you should raise an issue over there. This repo merely calls their code; it is impossible for anyone in this repo to fix this issue.
I've been looking to upgrade for a while, but you can't even get a 30-series card anymore. I'm not great at coding, and I didn't notice that the error is in Stability-AI's code... I doubt I'll get much love over there, but I'll give it a shot. ETA: I know you CAN get a 30-series card... I'm just not going to pay what they're asking for yesterday's model, nobody in their right mind should buy a 4080, and 4090s are a myth; nobody's ever seen one.
Just did a check... repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py is identical between the current commit and my old working directory. There are a lot of files referenced in that traceback; are you sure that's the problem spot? ETA: in fact, all the files in that folder are identical (except the .pyc cache files).
xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
xformers does not support the SD2.0 attention unit (which is different from SD1.0's) on older GPUs.
Not sure what you mean by 'identical', but it is perfectly normal for it to match the online version; after all, it is just cloned from the online version. This simply means their repo currently has the same issue.
You can see xformer repo: https://github.com/facebookresearch/xformers
the TORCH_CUDA_ARCH_LIST env variable is set to the architectures that you want to support. A suggested setup (slow to build but comprehensive) is export TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6"
This means the oldest supported compute capability is 6.0; your GTX 970 is 5.2, which is unsupported. It is already unexpected that it works for SD1.0 (it is not officially supported), so with SD2.0 it is normal for it not to work.
You may try building xformers locally for your older GPU with export TORCH_CUDA_ARCH_LIST="5.2", though. It may be able to compile the older CUDA kernels. If you are using the wheels from this repo (built by @C43H66N12O12S2), it will not work.
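To spell out why this fails: xformers (like PyTorch CUDA extensions in general) only ships kernels for the compute capabilities listed in TORCH_CUDA_ARCH_LIST at build time; a GPU whose capability is missing from that list hits exactly the "no kernel image is available" error in the traceback above. A minimal, torch-free sketch of that membership check (`arch_in_list` is a made-up helper for illustration, not a real xformers API):

```python
def arch_in_list(arch_list, capability):
    """Return True if a (major, minor) compute capability appears in a
    TORCH_CUDA_ARCH_LIST-style string such as "6.0;6.1;7.0;8.6+PTX"."""
    wanted = "%d.%d" % capability
    archs = [a.strip().removesuffix("+PTX") for a in arch_list.split(";")]
    return wanted in archs

# The suggested comprehensive list above omits Maxwell, so a GTX 970/980
# (compute capability 5.2) gets "no kernel image is available".
default_archs = "6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6"
print(arch_in_list(default_archs, (5, 2)))  # False
# A local build with TORCH_CUDA_ARCH_LIST="5.2" would include the kernel.
print(arch_in_list("5.2", (5, 2)))  # True
```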
> Not sure what you mean by 'identical', but it is perfectly normal for it to match the online version, after all, it is just cloned from the online version. This simply means their repo currently also have the same issue.
Identical meaning WinMerge (a file-comparison program) found the files to be the same. At any rate, I can't really complain to Stability AI like you suggested when it seems that xformers is my problem (if I'm following you correctly).
> You may try to build xformers locally for your older CUDA by export TORCH_CUDA_ARCH_LIST="5.2" though. It may be able to compile to older CUDA kernels. If you are using the wheels from this repo (built by @C43H66N12O12S2 ), it will not work.
This sounds like big-boy stuff... I have no idea where to begin but I'll look into that then
You can try disabling xformers. What do you mean by the files being the same? If you are comparing the content of repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py between an older commit of this repo and the current latest, of course they will be the same. Any file under repositories\stable-diffusion-stability-ai will NOT change when you do git checkout for this repo; they are independent. This repo treats the SD repo as a separate checkout, which will not get updated unless you explicitly run git checkout under that SD repo.
> What do you mean by the files are same?
I used WinMerge to compare the repositories\stable-diffusion-stability-ai folders. All contents except cache files and images are identical.
I have another folder with my working webui to compare against.
Would using the xformers built there work?
> would using the xformers built there work?
Built where?
On my working (pre-2.0) commit.
It does not matter. Building xformers is independent of which commit you have for SD webui and the SD repo. All that matters is the build config, which is controlled by the environment variable TORCH_CUDA_ARCH_LIST.
*blink* I think I'm wasting your time. I understood none of that.
HOWEVER, I can generate an image if I don't pass --xformers in my webui-user.bat file. I don't even know if I can be sure that xformers ever worked with my old commit; no errors come up there during startup or on image generation, though. Is there any way to tell if xformers is active in the web-ui?
If you see no errors, it worked. As I said, the problem is only with the SD2.0 code. And now that you have verified it works without xformers, it must be xformers not supporting your older GPU for SD2.0.
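If you want to check from outside the web-ui whether the xformers package is even importable in the venv, here is a minimal sketch (it only proves the package is installed, not that the webui actually enabled it):

```python
import importlib.util

# True if an 'xformers' package is importable on the current Python path.
has_xformers = importlib.util.find_spec("xformers") is not None
print(has_xformers)
```

For a fuller report (installed version, which attention kernels are available for your GPU), running python -m xformers.info inside the venv prints the details.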
Thanks for your time. I could've sworn I noticed an improvement in performance on the old commit after adding --xformers to the bat. Maybe it never changed anything and I imagined it. I'll see if the new commit functions as well as the old one then. Sorry for wasting your time (on Xmas, no less). I'll feel kinda silly if that was my problem for more than a month. ETA: tried building xformers... nope: error: legacy-install-failure
I have the same problem @TheMundaneDave. I have a GTX 970; it used to work great with xformers until a few days ago when stable-diffusion-webui got updated. When you remove --xformers from webui-user.bat it will work again, but it will be slower without xformers.
This version of V2.0:
https://github.com/cmdr2/stable-diffusion-ui
does work great on my GTX 970 with xformers, and it does all of the installing for you.
GTX 980 user here; I had the same issue with the CUDA error. Decided to update the UI today using git pull, saw that it's broken, reverted the git pull, and am currently using the following commit:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/9b384dfb5c05129f50cc3f0262f89e8b788e5cf3
Instructions, in case you did git pull and want to revert to the last working commit:
git reflog show
It will output you something like:
git reflog show
ce9827a (HEAD -> master, origin/master, origin/HEAD) HEAD@{0}: pull: Fast-forward
9b384df HEAD@{1}: clone: from https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
So 9b384df is the hash of the commit before the pull. To reset to it, type:
git reset --hard 9b384df
Don't forget to add xformers/medvram/split attention back in webui-user after the reset. Hope that helps until somebody fixes it or until we can afford a new GPU :D
Finally managed to compile and install xformers properly on the latest version of stable-diffusion-webui, and I wrote an installation guide from scratch. Not sure if this will become obsolete in the near future or whether it applies to architectures below Maxwell, but at the moment it works. Steps 5 to 7 are redundant, since you can make the venv and install the right version of PyTorch beforehand, or even better, edit the dependencies to download pytorch 1.13.1+cu117 in the first place instead of downloading and then uninstalling, but I am too lazy.
Stable Diffusion setup with xformers support on older GPUs (Maxwell, etc.) for Windows

1. Check your video card's compute capability under "CUDA-Enabled GeForce and TITAN Products": https://developer.nvidia.com/cuda-gpus
   I tested it on a GTX 980, so compute capability 5.2 should work; not sure about anything lower (5.0 and below).
2. Install git: https://gitforwindows.org/
3. Clone the repo:
   git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
4. Install Python 3.10: https://www.python.org/downloads/
   Make sure it's 3.10.x.
5. Launch webui-user.bat once; let it make the venv, then download and install the dependencies.
6. Open the webui URL, generate 1 image, and close the webui after it finishes.
7. Open powershell.exe and type:
   cd stable-diffusion-webui\venv\Scripts
   .\Activate.ps1
   pip uninstall torch torchvision torchaudio
   pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
8. Install Visual Studio 2022 Community Edition; during installation select "Desktop development with C++": https://visualstudio.microsoft.com/downloads/
9. (Optional) Install ninja to speed up compilation:
   - Download ninja-win.zip from https://github.com/ninja-build/ninja/releases and unzip it.
   - Place ninja.exe under C:\Windows OR add the full path of the extracted ninja.exe to the system PATH.
   - Run ninja -h in cmd and verify that you see a help message.
10. Launch cmd.exe as admin and type:
    git config --system core.longpaths true
11. Launch regedit.exe and check "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\LongPathsEnabled". Make sure it's set to 1 and its type is REG_DWORD.
12. Open powershell.exe and type:
    cd C:\stable-diffusion-webui\venv\Scripts
    .\Activate.ps1
13. Inside powershell.exe with the venv enabled (step 12), type:
    $env:TORCH_CUDA_ARCH_LIST = "5.2"
    pip install -v -U "git+https://github.com/facebookresearch/xformers.git@main#egg=xformers"
    (Note: in PowerShell, a plain set does not set environment variables; use the $env: syntax above.)
14. If the stars are right and everything is installed, check the versions and feature support:
    python -c "import torch; print(torch.__version__)"
    It should print 1.13.1+cu117. And for xformers:
    python -m xformers.info
    It should look something like:
    A matching Triton is not available, some optimizations will not be enabled.
    Error caught was: No module named 'triton'
    xFormers 0.0.16+6f3c20f.d20230116
    memory_efficient_attention.cutlassF: available
    memory_efficient_attention.cutlassB: available
    memory_efficient_attention.flshattF: available
    memory_efficient_attention.flshattB: available
    memory_efficient_attention.smallkF: available
    memory_efficient_attention.smallkB: available
    memory_efficient_attention.tritonflashattF: unavailable
    memory_efficient_attention.tritonflashattB: unavailable
    swiglu.fused.p.cpp: available
    is_triton_available: False
    is_functorch_available: False
    pytorch.version: 1.13.1+cu117
    pytorch.cuda: available
    gpu.compute_capability: 5.2
    gpu.name: NVIDIA GeForce GTX 980
    build.info: available
    build.cuda_version: 1107
    build.python_version: 3.10.2
    build.torch_version: 1.13.1+cu117
    build.env.TORCH_CUDA_ARCH_LIST: None
    build.env.XFORMERS_BUILD_TYPE: None
    build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
    build.env.NVCC_FLAGS: None
    build.env.XFORMERS_PACKAGE_FROM: None
    source.privacy: open source
15. Add xformers to webui-user.bat:
    - Open webui-user.bat and add --xformers in set COMMANDLINE_ARGS=
    - Example webui-user.bat:
      @echo off
      set PYTHON=
      set GIT=
      set VENV_DIR=
      set COMMANDLINE_ARGS=--opt-split-attention --xformers
      call webui.bat
    - Save it.
16. Launch webui-user.bat and hope that it works.
Such a build guide already exists in the wiki, fyi
> Such a build guide already exists in the wiki, fyi
Yes, I saw it.
However, it misses the fix for Windows that prevents git from pulling xformers (too-long filenames), it doesn't cover setting the right compute capability (specifically 5.2), it has even more redundant steps (like making a separate venv, installing old pytorch into it, and building a wheel only to install it later in the webui venv), and it states that you should use --force-enable-xformers, which is currently broken and will disable xformers due to an import error:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5898#issuecomment-1368054928
Good catches. Looks like the wiki could use some updates. @ClashSAN