stable-diffusion-webui-forge

[Feature Request]: ZLUDA Support?

Open RandomLegend opened this issue 11 months ago • 24 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Heyho,

Currently I use an RTX 3070 and I just ordered an RX 7900 XT. I know ROCm is a thing, but AFAIK it's not nearly as performant as CUDA?

So I found out about ZLUDA and that people got it working on A1111.

Did anyone try this on Forge? I mean, technically it should work just the same way as it does on A1111, right?

Proposed workflow

Not applicable

Additional information

No response

RandomLegend avatar Feb 28 '24 22:02 RandomLegend

i was about to ask about the same thing, hope that @lllyasviel will look into it

yacinesh avatar Feb 29 '24 19:02 yacinesh

I'm playing with ZLUDA today, and will update my comment if/when I learn other relevant details.

Running Win11 x64 + 7900XTX w/ Radeon "Game" driver. (*1)

Currently I use an RTX 3070 and I just ordered an RX 7900 XT. I know ROCm is a thing, but AFAIK it's not nearly as performant as CUDA?

Pretty much. From the benchmarks I've collected/seen/observed, the performance hierarchy is roughly...

  1. Linux-CUDA
  2. Linux-ROCm ≈ Win-CUDA (situational, so I'm calling it a tie)
  3. Win-ZLUDA
  4. Linux-DirectML
  5. Win-DirectML
  6. CPU

So I found out about ZLUDA and that people got it working on A1111.

Yep. I can confirm a wild speedup in A1111 on Windows from DirectML to ZLUDA; the ballpark is around 30x faster (3,000%!). There are more variables than I care to isolate, but as a quick check over 30 runs with batch size 4 and SD 1.5, the 7900 XTX went from an average of 2 s/it to 15 it/s.
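As a sanity check on that ballpark, the two quoted throughputs can be converted to a common unit (a quick sketch; the 2 s/it and 15 it/s figures are the ones from the comment above):

```python
# Convert the quoted pre-ZLUDA throughput (seconds per iteration) and the
# quoted ZLUDA throughput (iterations per second) to a common unit, then
# compute the speedup factor.
before_s_per_it = 2.0          # average before ZLUDA (from the comment)
zluda_it_per_s = 15.0          # average with ZLUDA (from the comment)

before_it_per_s = 1.0 / before_s_per_it       # 0.5 it/s
speedup = zluda_it_per_s / before_it_per_s    # 30.0

print(f"{speedup:.0f}x faster ({speedup * 100:.0f}%)")
```

So "around 30x / 3,000%" is consistent with the raw numbers quoted.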

Besides performance, ROCm (and maybe DirectML?) either doesn't do inpainting, or does it horribly. I watched many hours of inpainting tutorials last year before discovering this. ZLUDA enables reliable inpainting!

ZLUDA also makes deterministic details match: if you're working with something generated on NVIDIA hardware, the results will actually look the same (or as "the same" as they can be, given the countless other variables that affect the output).

Did anyone try this on Forge? I mean, technically it should work just the same way as it does on A1111, right?

I wouldn't know where to start, but I'm happy to try. If we work backwards from the lshqqytiger A1111 DirectML scripts associated with the "--use-zluda" parameter and replace the cublas64_11.dll and cusparse64_11.dll files with the ZLUDA versions, we should be able to get most of the way to a solution.
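For illustration, that DLL swap could be sketched as a small helper (hypothetical, untested against Forge; the rename mapping follows the common ZLUDA-for-SD guides, and the nvrtc entry is an assumption drawn from later comments in this thread):

```python
import shutil
from pathlib import Path

# Assumed mapping from ZLUDA release binaries to the CUDA DLL names that
# torch actually loads. This follows the usual ZLUDA guides and may need
# adjusting for other torch / CUDA versions.
ZLUDA_TO_CUDA = {
    "cublas.dll": "cublas64_11.dll",
    "cusparse.dll": "cusparse64_11.dll",
    "nvrtc.dll": "nvrtc64_112_0.dll",
}

def deploy_zluda(zluda_dir: Path, torch_lib_dir: Path) -> list:
    """Copy each ZLUDA binary over the matching CUDA DLL in torch's lib dir."""
    deployed = []
    for src_name, dst_name in ZLUDA_TO_CUDA.items():
        dst = torch_lib_dir / dst_name
        shutil.copyfile(zluda_dir / src_name, dst)
        deployed.append(dst)
    return deployed

# Example call (paths are placeholders for a typical Windows venv layout):
# deploy_zluda(Path(r"C:\zluda"), Path(r".\venv\Lib\site-packages\torch\lib"))
```

Backing up the original DLLs before overwriting them would make the swap reversible.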

joshaiken avatar Mar 13 '24 03:03 joshaiken

Could you share what COMMANDLINE_ARGS you have set up for option 3) win-zluda? I changed from --use-directml to --use-zluda, but I get a 'RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check' (I don't get that with directml). Thanks!

mongolsteppe avatar Mar 16 '24 15:03 mongolsteppe

https://wikiwiki.jp/sd_toshiaki/%E3%82%B3%E3%83%A1%E3%83%B3%E3%83%88/Nvidia%E4%BB%A5%E5%A4%96%E3%81%AE%E3%82%B0%E3%83%A9%E3%83%9C%E3%81%AB%E9%96%A2%E3%81%97%E3%81%A6

I found this statement on this page helpful.

"I managed to run Forge with ZLUDA v3.5 + 7900XTX as follows: I used AnimagineXLV3 with a batch of 100, and it executed without any errors until the end, so I think it's relatively stable. I'll skip the details about setting the paths and environment variables, since they're the same as for SD.NEXT.

I ran webui.bat from Forge to start it, but it immediately shut down after starting. I reinstalled torch and torchvision:

.\venv\Scripts\activate
pip uninstall torch torchvision -y
pip install torch==2.2.0 torchvision --index-url https://download.pytorch.org/whl/cu118

Then, I replaced cublas64_11.dll, cusparse64_11.dll, and nvrtc64_112_0.dll in venv\Lib\site-packages\torch\lib with the ones from ZLUDA.

In modules\initialize.py, under import torch, I added the following lines:

torch.backends.cudnn.enabled = False
torch.backends.cuda.enable_flash_sdp(False)
torch.backends.cuda.enable_math_sdp(True)
torch.backends.cuda.enable_mem_efficient_sdp(False)

That's how I did it."

Bocchi-Chan2023 avatar Mar 20 '24 03:03 Bocchi-Chan2023
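The four workaround flags quoted above can be collected into a small helper for clarity (a sketch based on the quote; the flag names are real torch.backends APIs, but whether all four are needed on a given ZLUDA build is the quoted reporter's finding, not something verified here):

```python
def apply_zluda_sdp_workaround(torch):
    """Apply the quoted ZLUDA attention workaround to an imported torch module.

    cuDNN and the flash / memory-efficient scaled-dot-product kernels rely
    on NVIDIA-specific code paths that ZLUDA does not accelerate, so the
    quoted workaround disables them and falls back to the plain math SDP
    implementation.
    """
    torch.backends.cudnn.enabled = False
    torch.backends.cuda.enable_flash_sdp(False)
    torch.backends.cuda.enable_math_sdp(True)
    torch.backends.cuda.enable_mem_efficient_sdp(False)
```

In the quoted recipe these four lines are pasted directly into modules\initialize.py right after `import torch`; wrapping them in a function just makes the patch easier to keep in one place.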

@Bocchi-Chan2023 I tried your instructions but I got this error: [image]

yacinesh avatar Mar 23 '24 13:03 yacinesh

@Bocchi-Chan2023 I tried your instructions but I got this error: [image]

.\venv\Scripts\activate
pip uninstall torch torchvision -y
pip install torch==2.2.0 torchvision --index-url https://download.pytorch.org/whl/cu118

Bocchi-Chan2023 avatar Mar 23 '24 13:03 Bocchi-Chan2023

@Bocchi-Chan2023 Yes, I already did that. The cublas64_11.dll, cusparse64_11.dll, and nvrtc64_112_0.dll files I copied from the SD.Next folder. Is that normal?

yacinesh avatar Mar 23 '24 14:03 yacinesh

All these guides are for Windows.

PATHs and libraries work a little differently on Linux, and I'd love to see someone get ZLUDA + Forge working on Linux.

RandomLegend avatar Mar 23 '24 14:03 RandomLegend

@Bocchi-Chan2023 Yes, I already did that. The cublas64_11.dll, cusparse64_11.dll, and nvrtc64_112_0.dll files I copied from the SD.Next folder. Is that normal?

I think it's still possible, but my recommendation would be to rename and deploy the binaries downloaded from the latest ZLUDA release :)

Bocchi-Chan2023 avatar Mar 23 '24 15:03 Bocchi-Chan2023

@Bocchi-Chan2023 Yes, I already did that. The cublas64_11.dll, cusparse64_11.dll, and nvrtc64_112_0.dll files I copied from the SD.Next folder. Is that normal?

I think it's still possible, but my recommendation would be to rename and deploy the binaries downloaded from the latest ZLUDA release :)

Am I correct here? [image]

yacinesh avatar Mar 23 '24 15:03 yacinesh

Maybe @lshqqytiger could help out?

Zaakh avatar Mar 27 '24 08:03 Zaakh

@Bocchi-Chan2023 Yes, I already did that. The cublas64_11.dll, cusparse64_11.dll, and nvrtc64_112_0.dll files I copied from the SD.Next folder. Is that normal?

I think it's still possible, but my recommendation would be to rename and deploy the binaries downloaded from the latest ZLUDA release :)

Am I correct here? [image]

yes

Bocchi-Chan2023 avatar Mar 27 '24 10:03 Bocchi-Chan2023

Just to ask all of you: did you all get it working? Because I did, but it needed a couple more steps to install and get running. I don't want to fill this thread unless it's needed.

Grey3016 avatar Apr 05 '24 00:04 Grey3016

@Grey3016 I did not, but then again I am on Linux and the guides I found were for Windows.

I am not unsatisfied with the ROCm performance, but I have no idea what gains I am possibly missing out on with ZLUDA.

RandomLegend avatar Apr 05 '24 05:04 RandomLegend

@Grey3016 I did not, but then again I am on Linux and the guides I found were for Windows.

I am not unsatisfied with the ROCm performance, but I have no idea what gains I am possibly missing out on with ZLUDA.

You aren't. The only reason we're using ZLUDA on Windows is because we don't have ROCm on Windows... yet.

brknsoul avatar Apr 18 '24 21:04 brknsoul

Just to ask all of you: did you all get it working? Because I did, but it needed a couple more steps to install and get running. I don't want to fill this thread unless it's needed.

Would you be able to provide the extra steps you had to take? Thanks.

beosliege avatar Apr 29 '24 12:04 beosliege

ZLUDA fork: https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu-forge
Launch with --zluda (optional).
Requirements: Visual C++ Runtime, ROCm 5.7.

lshqqytiger avatar Apr 29 '24 12:04 lshqqytiger

ZLUDA fork: https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu-forge
Launch with --zluda (optional).
Requirements: Visual C++ Runtime, ROCm 5.7.

I'm already trying to use your forked Forge, but I'm getting a lot of errors. Where can I report issues?

yacinesh avatar Apr 29 '24 12:04 yacinesh

I enabled the issues feature.

lshqqytiger avatar Apr 29 '24 13:04 lshqqytiger

I enabled the issues feature.

I've finally managed to open it, but it failed to install insightface automatically. Should I install it manually or leave it?

yacinesh avatar Apr 29 '24 13:04 yacinesh

Ignore it if there isn't any issue (e.g. module not found).

lshqqytiger avatar Apr 29 '24 13:04 lshqqytiger

ZLUDA fork: https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu-forge
Launch with --zluda (optional).
Requirements: Visual C++ Runtime, ROCm 5.7.

I could not start it in my environment. The runtime and ROCm are already installed. These are the errors I got:

Failed to install ZLUDA: 'Namespace' object has no attribute 'use_zluda_dnn'

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

File "C:\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Bocchi-Chan2023 avatar May 01 '24 07:05 Bocchi-Chan2023

Could you share what COMMANDLINE_ARGS you have set up for option 3) win-zluda? I changed from --use-directml to --use-zluda, but I get a 'RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check' (I don't get that with directml). Thanks!

./webui.bat --use-zluda --listen --no-half-vae

joshaiken avatar May 01 '24 07:05 joshaiken

ZLUDA fork: https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu-forge
Launch with --zluda (optional).
Requirements: Visual C++ Runtime, ROCm 5.7.

I could not start it in my environment. The runtime and ROCm are already installed. These are the errors I got:

Failed to install ZLUDA: 'Namespace' object has no attribute 'use_zluda_dnn'

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

File "C:\stable-diffusion-webui-amdgpu-forge\venv\lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Will fix

lshqqytiger avatar May 01 '24 08:05 lshqqytiger