Double speed when forcing fp32 with ZLUDA on Flux
Has anyone else noticed a speed boost (at the cost of more VRAM) with these command-line args? Model: GGUF Q4, args: --use-zluda --attention-quad --all-in-fp32
I'm about 3 s/it faster than with bfloat16.
Yes, you aren't the first one https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/1684
You can use Q4? How?
Just put the Q4 model in the Stable-diffusion folder. Open launch.py, add these two lines, and adjust the path to point at your packages_3rdparty folder:
import sys
sys.path.append(r"Your_Path\packages_3rdparty")
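A slightly expanded sketch of that launch.py edit, with a sanity check so a mistyped path fails loudly instead of surfacing later as a confusing ImportError. The path here is the same placeholder as above; replace it with the real location of Forge's packages_3rdparty folder:

```python
import os
import sys

# Placeholder from the thread -- replace with the absolute path to the
# packages_3rdparty folder inside your Forge install.
pkg_dir = r"Your_Path\packages_3rdparty"

# Optional: warn early if the path is wrong, instead of getting an
# ImportError once the GGUF model is loaded.
if not os.path.isdir(pkg_dir):
    print(f"warning: {pkg_dir} not found; GGUF support may fail to import")

sys.path.append(pkg_dir)
```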
Hello, how are the sampler and other settings configured? I am able to run it, but all that comes out is a black picture.
Black picture may be https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/1278
Thank you, I successfully generated images after adding --all-in-fp32 to the COMMANDLINE_ARGS= line in webui-user.bat.
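For anyone following along, the relevant line in webui-user.bat would look something like this (flags taken from the posts above; the rest of the file stays as shipped):

```bat
rem webui-user.bat -- pass the ZLUDA/fp32 flags discussed in this thread
set COMMANDLINE_ARGS=--use-zluda --attention-quad --all-in-fp32
```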