[Bug]: xformers either can't be installed or --force-enable-xformers decreases performance from 2.5s/it to 6s/it with same prompt
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
I've followed various guides on how to build and install xformers (examples: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/2449, https://www.reddit.com/r/StableDiffusion/comments/xz26lq/automatic1111_xformers_cross_attention_with_on/) but can't get it to do what it's supposed to do, which is speed up image generation. I've searched the existing issues, e.g. #2270 and #2449, where @C43H66N12O12S2 and @Farfie posted various ideas for a workaround, but none of them seem to work for me.
I've tried what @mezotaken suggested in https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/2270#issuecomment-1274927359 and inserted
    if not is_installed("xformers"):
        print("1")
    if xformers:
        print("2")
    if platform.python_version().startswith("3.10"):
        print("3")
into launch.py. When I start webui-user.bat with --force-enable-xformers, only 3 is printed to the console. But that can't be right: if I look into E:\Programs\stable-diffusion-webui\venv\Lib\site-packages I can see that the xformers and xformers-0.0.14.dev0.dist-info folders exist. When I activate the venv and run "pip list", I can also see xformers 0.0.14.dev0 listed. When I then start webui with --force-enable-xformers, the log does say that it's applying the xformers optimization, but generation performance drops sharply with this option enabled, from 2.5s/it to 6s/it using the same prompt and "DPM++ 2M Karras" as the sampler.
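To double-check what the venv's interpreter actually sees, here is a minimal sketch that can be run with the venv activated. It assumes only the Python standard library; the helper names `is_importable` and `installed_version` are made up for this illustration and are not part of launch.py.

```python
import importlib.util
import sys
from importlib import metadata


def is_importable(package: str) -> bool:
    """Return True if the current interpreter can locate `package`."""
    try:
        return importlib.util.find_spec(package) is not None
    except (ModuleNotFoundError, ValueError):
        return False


def installed_version(package: str):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # sys.executable shows which interpreter (venv or system) is running.
    print("interpreter:", sys.executable)
    print("xformers importable:", is_importable("xformers"))
    print("xformers version:", installed_version("xformers"))
```

If the interpreter path printed here is not the one inside venv\Scripts, the package was probably installed into a different environment than the one the web UI uses.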
P.S. I see people often mention performance as it/s while my webui outputs s/it... is there some way to change and reverse it?
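On the units: s/it and it/s are reciprocals of each other, and a tqdm-style progress bar (assuming that is what the console output comes from) switches to s/it on its own once the rate falls below one iteration per second. A small sketch of the conversion, using the two figures from this report; the function name is hypothetical:

```python
def seconds_per_it_to_it_per_second(s_per_it: float) -> float:
    """Convert seconds per iteration into iterations per second."""
    if s_per_it <= 0:
        raise ValueError("s/it must be positive")
    return 1.0 / s_per_it


if __name__ == "__main__":
    for s_per_it in (2.5, 6.0):
        rate = seconds_per_it_to_it_per_second(s_per_it)
        print(f"{s_per_it} s/it = {rate:.3f} it/s")
```

So 2.5 s/it corresponds to 0.4 it/s, and 6 s/it to roughly 0.167 it/s.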
Steps to reproduce the problem
- Build xformers
- Install xformers
- Start SD with --force-enable-xformers
What should have happened?
- Console should've printed 2 and 3 if xformers is installed
- Speed of image generation should've stayed the same or hopefully increased, not decreased (i.e. s/it should have gone down, not up)
Commit where the problem happens
804d9fb83d0c63ca3acd36378707ce47b8f12599
What platforms do you use to access UI ?
Windows
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
--medvram --listen --port 7860 --share --gradio-auth xxxx:xxxxx --gradio-img2img-tool color-sketch --force-enable-xformers
Additional information, context and logs
PC specs: Windows 10, 16 GB RAM (+ 36 GB virtual RAM); graphics card: Nvidia GeForce GTX 960 (4 GB VRAM, GPU chip: GM206, architecture: Maxwell 2.0, CUDA compute capability: 5.2)
Installed packages: CUDA 11.8, torch-1.14.0.dev20221107+cu117.dist-info
Log from starting SD:
Python 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)]
Commit hash: 804d9fb83d0c63ca3acd36378707ce47b8f12599
3
Installing requirements for Web UI
Launching Web UI with arguments: --medvram --listen --port 7860 --share --gradio-auth xxx:xxx --gradio-img2img-tool color-sketch --force-enable-xformers
Error setting up CodeFormer:
Traceback (most recent call last):
  File "E:\Programs\stable-diffusion-webui\modules\codeformer_model.py", line 35, in setup_model
    from modules.codeformer.codeformer_arch import CodeFormer
ModuleNotFoundError: No module named 'modules.codeformer'
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Loading weights [19810fe6] from E:\Programs\stable-diffusion-webui\models\Stable-diffusion\merges\berrymix.ckpt
Loading VAE weights from: E:\Programs\stable-diffusion-webui\models\Stable-diffusion\base\sd1.5.vae.pt
Applying xformers cross attention optimization.
Model loaded.
1680 1050 1000
Running on local URL: http://0.0.0.0:7860
Running on public URL: https://xxxxx.gradio.app
This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
I have also tried forcing xformers on my 980M/GM204, with a similar performance drop. It's possible that xformers does not work well on sm52 and that this is what causes the slowdown.
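For anyone wanting to confirm which compute capability their card reports, a minimal sketch follows. It assumes PyTorch is installed in the same venv and uses `torch.cuda.get_device_capability`; the helpers `sm_name` and `detect_capability` are hypothetical names for this example. It only reports the capability and says nothing about whether xformers is fast on it.

```python
def sm_name(capability) -> str:
    """Format a (major, minor) compute capability tuple as e.g. 'sm_52'."""
    major, minor = capability
    return f"sm_{major}{minor}"


def detect_capability():
    """Return (major, minor) for CUDA device 0, or None if unavailable."""
    try:
        import torch  # assumed to be installed in the web UI's venv
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device (or no torch) found")
    else:
        print("Compute capability:", sm_name(cap))
```

A GTX 960 (GM206) or 980M (GM204) would be expected to report (5, 2), i.e. sm_52.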
Closing as stale.