stable-diffusion.cpp

[Feature Request] Enable Flash Attention in the released binary

Open JohnAlcatraz opened this issue 9 months ago • 0 comments

When I tried out the released binaries, I was surprised they do not have flash attention enabled. I mean these:

sd-master-ce1bcc7-bin-win-avx-x64.zip

sd-master-ce1bcc7-bin-win-avx2-x64.zip

sd-master-ce1bcc7-bin-win-avx512-x64.zip

The readme mentions this about flash attention:

"Enabling flash attention reduces memory usage by at least 400 MB. At the moment, it is not supported when CUBLAS is enabled because the kernel implementation is missing."

The "bin-win-avx" binaries are CPU-only and built without CUBLAS, so enabling flash attention for them shouldn't be an issue, right?

When I then tried to compile the project myself with flash attention enabled, I also noticed that flash attention does not seem to work unless an assert is manually disabled (https://github.com/leejet/stable-diffusion.cpp/issues/138).
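For anyone who wants to reproduce the self-compiled build, here is a rough sketch of the configure and build steps. It assumes the CMake option is still named `SD_FLASH_ATTN`, as in the readme at the time, and that the commands are run from a `build` directory inside the checkout. The helper `print_build_commands` is hypothetical and only prints the commands so they can be reviewed before running them.

```shell
# Assumption: the flash attention switch is the CMake option SD_FLASH_ATTN,
# as documented in the project readme at the time. Check your checkout's
# CMakeLists.txt if the configure step rejects it.
SD_CMAKE_FLAGS="-DSD_FLASH_ATTN=ON"

# Hypothetical helper: prints the configure and build commands instead of
# executing them. Run the printed commands from a build/ directory.
print_build_commands() {
    echo "cmake .. ${SD_CMAKE_FLAGS}"
    echo "cmake --build . --config Release"
}

print_build_commands
```

This only covers a CPU build. Per the readme quote above, the option should not be combined with CUBLAS, and the assert from the linked issue may still trigger at runtime.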

JohnAlcatraz May 11 '24 21:05