
[Bug]: (WSL) xFormers wasn't build with CUDA support

Open TheRealShieri opened this issue 2 years ago • 34 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

Had to reinstall the webui because git pull was not updating everything properly. Used --xformers, it installed normally, but when I generate an image I get a long string of errors saying that xFormers wasn't built with CUDA support. I manually tried to reinstall xFormers; still the same issue.

Steps to reproduce the problem

Add --xformers to webui-user.sh, then generate an image.

What should have happened?

xFormers should be properly built with CUDA support.

Commit where the problem happens

dac59b9b073f86508d3ec787ff731af2e101fbc

What platforms do you use to access UI ?

Linux

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

--listen --api --opt-split-attention --xformers
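
These flags are normally passed through the COMMANDLINE_ARGS variable in webui-user.sh; a minimal sketch of that line (flag order is illustrative, file assumed to be in the repo root):

# webui-user.sh
export COMMANDLINE_ARGS="--listen --api --opt-split-attention --xformers"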

Additional information, context and logs

Error completing request
Arguments: ('task(csqxfz4flxydfi1)', -', 'None', 'None', 30, 15, False, False, 1, 1, 11, -1.0, -1.0, 0, 0, 0, False, 512, 696, False, 0.7, 2, 'Latent', 0, 0, 0, 0, False, False, False, False, '', 1, '', 0, '', True, False, False) {}
Traceback (most recent call last):
  File "/home/shieri/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/home/shieri/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/home/shieri/stable-diffusion-webui/modules/txt2img.py", line 52, in txt2img
    processed = process_images(p)
  File "/home/shieri/stable-diffusion-webui/modules/processing.py", line 480, in process_images
    res = process_images_inner(p)
  File "/home/shieri/stable-diffusion-webui/modules/processing.py", line 609, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/home/shieri/stable-diffusion-webui/modules/processing.py", line 801, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/home/shieri/stable-diffusion-webui/modules/sd_samplers.py", line 544, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/shieri/stable-diffusion-webui/modules/sd_samplers.py", line 447, in launch_sampling
    return func()
  File "/home/shieri/stable-diffusion-webui/modules/sd_samplers.py", line 544, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/modules/sd_samplers.py", line 350, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond={"c_crossattn": [tensor[a:b]], "c_concat": [image_cond_in[a:b]]})
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
    h = module(h, emb, context)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
    x = block(x, context=context[i])
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/modules/sd_hijack_checkpoint.py", line 4, in BasicTransformerBlock_forward
    return checkpoint(self._forward, x, context)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 235, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 96, in forward
    outputs = run_function(*args)
  File "/home/shieri/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 262, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/home/shieri/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shieri/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 293, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)
  File "/home/shieri/stable-diffusion-webui/repositories/xformers/xformers/ops/fmha/__init__.py", line 197, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/home/shieri/stable-diffusion-webui/repositories/xformers/xformers/ops/fmha/__init__.py", line 293, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/home/shieri/stable-diffusion-webui/repositories/xformers/xformers/ops/fmha/__init__.py", line 309, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/home/shieri/stable-diffusion-webui/repositories/xformers/xformers/ops/fmha/dispatch.py", line 95, in _dispatch_fw
    return _run_priority_list(
  File "/home/shieri/stable-diffusion-webui/repositories/xformers/xformers/ops/fmha/dispatch.py", line 70, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 5568, 8, 40) (torch.float16)
     key         : shape=(1, 5568, 8, 40) (torch.float16)
     value       : shape=(1, 5568, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 40
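
A quick way to see what the venv actually ended up with is to query torch and xformers directly; a sketch, assuming the default venv location inside the repo:

source venv/bin/activate
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -m xformers.info   # lists which ops (cutlassF, flshattF, ...) are usable in this build
pip show xformers torch   # installed versions and locations

If torch.version.cuda prints None, or xformers.info reports the memory-efficient ops as unavailable, the xformers in the venv is a CPU-only build.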

TheRealShieri avatar Jan 18 '23 06:01 TheRealShieri

what gpu do you have?

ClashSAN avatar Jan 18 '23 06:01 ClashSAN

Originally I had dual 3090s for this, but I pulled one out when I was reinstalling. xFormers should just work when --xformers is placed in webui-user.sh, because that is what I did a bit over a few weeks ago with no issues; I have no idea what changed to cause this.

TheRealShieri avatar Jan 18 '23 06:01 TheRealShieri

Have you tried with Python 3.10.6? Is this an SD 2.0 model?

ClashSAN avatar Jan 18 '23 06:01 ClashSAN

It's on Linux. I could not get 3.10.6; it would only install the latest 3.10.9, and running ./webui.sh would use 3.9.12. The model is also just an SD 1.5 model.

TheRealShieri avatar Jan 18 '23 06:01 TheRealShieri

3.10.x is supposed to work. So since it's Linux, you built the xformers wheel manually before, and it worked, but doesn't anymore?

Check your .whl filename; maybe it was for a different Python version? (Your error log shows your venv is Python 3.8.)
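
One quick way to check which interpreter the venv is using and which Python ABI a wheel was built for (a sketch; the wheel filename below is only an example of the cp-tag pattern):

source venv/bin/activate && python --version   # interpreter actually inside the venv
ls venv/lib/                                   # shows e.g. python3.8 vs python3.10
# wheel filenames carry the ABI tag, e.g. xformers-0.0.16-cp310-cp310-linux_x86_64.whl;
# a cp38 wheel cannot be installed into a 3.10 venv, and vice versa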

ClashSAN avatar Jan 18 '23 07:01 ClashSAN

Alright, sorry if I'm making little sense; it's late and I've been at this for a few hours. With the Python version in webui-user.sh set to 3.10, I would get this error:

Python 3.10.9 (main, Dec 7 2022, 01:12:00) [GCC 9.4.0]
Commit hash: dac59b9b073f86508d3ec787ff731af2e101fbcc
Installing torch and torchvision
Traceback (most recent call last):
  File "/home/shieri/stable-diffusion-webui/launch.py", line 316, in <module>
    prepare_environment()
  File "/home/shieri/stable-diffusion-webui/launch.py", line 225, in prepare_environment
    run(f'"{python}" -m {torch_command}', "Installing torch and torchvision", "Couldn't install torch")
  File "/home/shieri/stable-diffusion-webui/launch.py", line 65, in run
    raise RuntimeError(message)
RuntimeError: Couldn't install torch.
Command: "/usr/bin/python3.10" -m pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
Error code: 1
stdout: <empty>
stderr: Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/pip/__main__.py", line 16, in <module>
    from pip._internal.cli.main import main as _main  # isort:skip  # noqa
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
    from pip._internal.cli import cmdoptions
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/cmdoptions.py", line 19, in <module>
    from distutils.util import strtobool
ModuleNotFoundError: No module named 'distutils.util'
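
That distutils failure usually means the Python 3.10 interpreter was installed without its distutils companion package, which the system pip needs. A sketch of the usual Ubuntu-side fix (package names assume an apt-based Python 3.10 install):

sudo apt-get install python3.10-distutils python3.10-venv
# then remove the half-created venv so webui.sh rebuilds it cleanly
rm -rf venv
./webui.sh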

TheRealShieri avatar Jan 18 '23 07:01 TheRealShieri

Also it could just be the case of WSL Ubuntu being stupid:

(base) shieri@Shieri:~$ sudo apt-get install --reinstall python3-distutils
E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 17084 (apt-get)
N: Be aware that removing the lock file is not a solution and may break your system.
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?

Nothing is running at the moment.
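
The lock error names the PID holding it, so it may be worth checking what that process is before retrying (a sketch, not a fix; on Ubuntu/WSL it is often unattended-upgrades or a leftover apt-get still finishing in the background):

ps -p 17084 -o pid,ppid,etime,cmd        # what process 17084 actually is
sudo lsof /var/lib/dpkg/lock-frontend    # who currently holds the lock
# once nothing holds it, re-run the apt-get command; do not delete the lock file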

TheRealShieri avatar Jan 18 '23 07:01 TheRealShieri

Do you absolutely need to run with WSL?

ClashSAN avatar Jan 18 '23 07:01 ClashSAN

Alright, looks like I got 3.10.9 working, but for some reason it would not find my GPU:

Traceback (most recent call last):
  File "/home/shieri/stable-diffusion-webui/launch.py", line 316, in <module>
    prepare_environment()
  File "/home/shieri/stable-diffusion-webui/launch.py", line 228, in prepare_environment
    run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
  File "/home/shieri/stable-diffusion-webui/launch.py", line 89, in run_python
    return run(f'"{python}" -c "{code}"', desc, errdesc)
  File "/home/shieri/stable-diffusion-webui/launch.py", line 65, in run
    raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/usr/bin/python3.10" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 1
stdout:
stderr: Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

Really this just ended up being a whole rabbit hole of reinstalling and uninstalling old versions, the same versions, etc., just to somehow get it working again. Also, it looks like there might be an issue with me being on CUDA version 12?
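
Before more reinstalling, it might help to confirm the card is visible inside WSL at all and that the torch build is a CUDA one; a sketch (the version pins are just the ones the webui was installing around this time, not a recommendation):

nvidia-smi                                    # run inside WSL; the 3090 should be listed
python3.10 -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# if the assert keeps failing, one option is reinstalling torch explicitly as a CUDA build:
python3.10 -m pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117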

TheRealShieri avatar Jan 18 '23 07:01 TheRealShieri

yes, use 11.8

ClashSAN avatar Jan 18 '23 07:01 ClashSAN

I'm not sure what to run to downgrade to 11.8. When I follow NVIDIA's instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda

nvidia-smi still reports that it's on CUDA version 12.
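
One thing to keep in mind: the "CUDA Version" that nvidia-smi prints comes from the Windows driver, not from the toolkit installed inside WSL, so it will keep saying 12 no matter what apt installs. To get an 11.8 toolkit instead of the newest one, a sketch (assumes the wsl-ubuntu CUDA repo added above; paths are the usual defaults):

sudo apt-get -y install cuda-toolkit-11-8     # pinned toolkit instead of the generic "cuda" metapackage
/usr/local/cuda-11.8/bin/nvcc --version       # confirms which toolkit is actually installed
nvidia-smi                                    # will still show the driver's CUDA 12.x - that's expected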

TheRealShieri avatar Jan 18 '23 07:01 TheRealShieri

Did you set this up with xformers working before in WSL? Why not use the old .whl and the old commit, or did you forget which one it was?

ClashSAN avatar Jan 18 '23 07:01 ClashSAN

I did have xformers working before. It was working on an older commit, but I don't remember which version, since I had an issue where it was truncating my prompt past 75 tokens. That is when I used git pull, but it was not properly updating the files, and people in the project AI server then told me to just reinstall the entire thing.

TheRealShieri avatar Jan 18 '23 07:01 TheRealShieri

so you deleted the folder? :(

ClashSAN avatar Jan 18 '23 07:01 ClashSAN

Sadly yes, I was told so ;-;

TheRealShieri avatar Jan 18 '23 08:01 TheRealShieri

if you remember the date, maybe we could figure out the commit version, or you can keep trying

ClashSAN avatar Jan 18 '23 08:01 ClashSAN

I could try to see if I can find an old screenshot of the version. I do know that it was before the change where height and width were only in steps of 64. I'm also going to keep trying, since I had the issue with the truncated prompt for a long time, even though I started and installed the webui after that was apparently fixed?

TheRealShieri avatar Jan 18 '23 08:01 TheRealShieri

Yeah, I'm not able to find much on the older commit version it was on. Now it's really just a matter of testing more tomorrow and seeing if I can downgrade CUDA from 12 to 11.8 or lower.

TheRealShieri avatar Jan 18 '23 08:01 TheRealShieri

OK, I don't know why, but for some reason WSL Ubuntu would not let me downgrade CUDA from 12 to 11.8, even with a full CUDA reinstall and driver reinstall.

TheRealShieri avatar Jan 18 '23 21:01 TheRealShieri

I have the same issue with Linux Mint 21. I'm running Python 3.8 in an Anaconda env with cudatoolkit 11.3.1.

BrahRah avatar Jan 21 '23 16:01 BrahRah

Same problem. Running GNU/Debian 11.

fubuki4649 avatar Jan 23 '23 07:01 fubuki4649

Same problem; things I've tried are below. It started today and I'm sure it's an easy fix, but it might not be, so:

  • used --reinstall-xformers --xformers
  • tried with no extensions
  • git pulled stable and xformers - also ran pip install -r requirements.txt on both to be safe
  • new NVIDIA Studio drivers (1.23.23 - today!)
  • NVIDIA 3090 ASUS Strix; happened after the latest driver update
  • I do have WSL, and I'm uninstalling it - it did work with WSL before I updated my Studio drivers
    • I will follow up if this changes anything.
  • I just upgraded CUDA to test (12.0 + docs, runtime, VS integration)
    • I did this after the error, and it did not affect the situation
  • I'm running Windows 10 x64 Pro
  • Python 3.10.9 - the Windows Store version, due to a PATH shitshow that kinda forced my hand

slopcop avatar Jan 23 '23 18:01 slopcop

Could possibly try:

  1. removing the venv
  2. launching stable diffusion once (to install all the necessary base packages)
  3. install torch/torchvision (could try using pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/cu117 to manually install them; even though it's for CUDA 11.7, it might work?)
  4. install xformers via pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers

Maybe in one of these steps some sort of error will show up that would otherwise be swallowed.
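
One caveat on step 4: building xformers from source only includes the CUDA kernels if a CUDA toolkit (nvcc) is visible at build time; otherwise the build can still succeed but produces exactly the "wasn't build with CUDA support" errors above. A quick pre-check before the git install (a sketch):

which nvcc && nvcc --version    # should report the toolkit version you expect, e.g. 11.x
echo $CUDA_HOME                 # if set, should point at the matching toolkit, e.g. /usr/local/cuda-11.8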

atensity avatar Jan 23 '23 18:01 atensity

Renaming the venv folder to x-venv fixed the issue for me.
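
That works because webui.sh rebuilds a missing venv from scratch on the next launch and reinstalls torch and xformers into it. A sketch of the equivalent steps (venv.bak is just an arbitrary backup name):

mv venv venv.bak            # or rm -rf venv
./webui.sh --xformers       # recreates the venv and reinstalls torch and xformers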

slopcop avatar Jan 23 '23 19:01 slopcop

Unfortunately those steps you outlined, @atensity, didn't work for me; I'm getting this error:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/streams/memory.py", line 94, in receive
    return self.receive_nowait()
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/streams/memory.py", line 89, in receive_nowait
    raise WouldBlock
anyio.WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/base.py", line 77, in call_next
    message = await recv_stream.receive()
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/streams/memory.py", line 114, in receive
    raise EndOfStream
anyio.EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/base.py", line 106, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/mnt/das/slow/stable-diffusion/modules/api/api.py", line 80, in log_and_time
    res: Response = await call_next(req)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/base.py", line 80, in call_next
    raise app_exc
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/base.py", line 69, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/gzip.py", line 26, in __call__
    await self.app(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/routing.py", line 706, in __call__
    await route.handle(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/mnt/das/slow/stable-diffusion/modules/api/api.py", line 187, in text2imgapi
    processed = process_images(p)
  File "/mnt/das/slow/stable-diffusion/modules/processing.py", line 476, in process_images
    res = process_images_inner(p)
  File "/mnt/das/slow/stable-diffusion/modules/processing.py", line 614, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/mnt/das/slow/stable-diffusion/modules/processing.py", line 809, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/mnt/das/slow/stable-diffusion/modules/sd_samplers.py", line 544, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/mnt/das/slow/stable-diffusion/modules/sd_samplers.py", line 447, in launch_sampling
    return func()
  File "/mnt/das/slow/stable-diffusion/modules/sd_samplers.py", line 544, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/modules/sd_samplers.py", line 350, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond={"c_crossattn": [tensor[a:b]], "c_concat": [image_cond_in[a:b]]})
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
    h = module(h, emb, context)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
    x = block(x, context=context[i])
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 259, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 129, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/mnt/das/slow/stable-diffusion/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 262, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/das/slow/stable-diffusion/modules/sd_hijack_optimizations.py", line 293, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 195, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 291, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 307, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 95, in _dispatch_fw
    return _run_priority_list(
  File "/mnt/das/slow/stable-diffusion/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 70, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 5184, 8, 40) (torch.float16)
     key         : shape=(1, 5184, 8, 40) (torch.float16)
     value       : shape=(1, 5184, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    triton is not available
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 40

Daxiongmao87 avatar Jan 25 '23 04:01 Daxiongmao87

I'm getting this error on Win10 running just via CLI, so it doesn't look completely isolated to WSL. Update: I was accidentally running Python 3.6, but the error came up as this anyway.

SirProdigle avatar Feb 02 '23 17:02 SirProdigle

I guess I should note I'm using a bare-metal university server on 22.04, not WSL, if that helps identify the scope of the issue.

Daxiongmao87 avatar Feb 02 '23 17:02 Daxiongmao87

I had the same issue as OP, also running on Linux. I could make it work by:

  • removing the venv directory
  • launching ./webui.sh --xformers; it reinstalled everything, including xformers. Now it works.

output:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
Commit hash: 226d840e84c5f306350b0681945989b86760e616
Installing torch and torchvision
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
Collecting torch==1.13.1+cu117
	  Using cached https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl (1801.8 MB)
Collecting torchvision==0.14.1+cu117
  Using cached https://download.pytorch.org/whl/cu117/torchvision-0.14.1%2Bcu117-cp310-cp310-linux_x86_64.whl (24.3 MB)
Collecting typing-extensions
  Using cached typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting requests
  Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting numpy
  Using cached numpy-1.24.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Collecting pillow!=8.3.*,>=5.3.0
  Using cached Pillow-9.4.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB)
Collecting certifi>=2017.4.17
  Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting charset-normalizer<4,>=2
  Using cached charset_normalizer-3.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (198 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
Installing collected packages: charset-normalizer, urllib3, typing-extensions, pillow, numpy, idna, certifi, torch, requests, torchvision
Successfully installed certifi-2022.12.7 charset-normalizer-3.0.1 idna-3.4 numpy-1.24.1 pillow-9.4.0 requests-2.28.2 torch-1.13.1+cu117 torchvision-0.14.1+cu117 typing-extensions-4.4.0 urllib3-1.26.14
Installing gfpgan
Installing clip
Installing open_clip
Installing xformers
Installing requirements for CodeFormer
Installing requirements for Web UI

chrisburrc avatar Feb 03 '23 21:02 chrisburrc

I had the same issue as OP, also running on Linux. I could make it work by:

  • removing the venv directory
  • launching ./webui.sh --xformers; it reinstalled everything, including xformers. Now it works.
What commit were you using?

Daxiongmao87 avatar Feb 03 '23 21:02 Daxiongmao87

For web-ui? I just did a fresh install today.

chrisburrc avatar Feb 03 '23 21:02 chrisburrc