stable-diffusion.cpp
[Feature Request] Enable Flash Attention in the released binary
When I tried out the released binaries, I was surprised they do not have flash attention enabled. I mean these:
sd-master-ce1bcc7-bin-win-avx-x64.zip
sd-master-ce1bcc7-bin-win-avx2-x64.zip
sd-master-ce1bcc7-bin-win-avx512-x64.zip
The readme mentions this about flash attention:
Enabling flash attention reduces memory usage by at least 400 MB. At the moment, it is not supported when CUBLAS is enabled because the kernel implementation is missing.
The "bin-win-avx" binaries are CPU-only and built without CUBLAS, so enabling flash attention for them shouldn't be a problem, right?
When I then tried to compile the project myself with flash attention enabled, I also noticed that flash attention does not seem to work unless an assert is manually disabled (see https://github.com/leejet/stable-diffusion.cpp/issues/138).