stable-diffusion.cpp
FreeU support
FreeU is a method that can be used with any sampling method and any SD model, and it seems to give noticeably better results: many artifacts are fixed, text is rendered somewhat better, and the colors are less murky.
I imagine this would help counteract the "murkiness" of SDXL images' colors (SDXL turbo looks alright using euler_a, but becomes murkier again with lcm sampler). LCM images seem to be especially murky as well.
There are two versions of FreeU in ComfyUI. FreeU_v2, the newer version of the method, seems to be the better one. Here is a comparison: https://www.reddit.com/r/comfyui/comments/17l5spc/comparison_of_freeu_v2_with_old_freeu/
- Paper: https://arxiv.org/abs/2309.11497
- Repository: https://github.com/ChenyangSi/FreeU
- Project Page: https://chenyangsi.top/FreeU/
- Demo: https://huggingface.co/spaces/ChenyangSi/FreeU
FreeU_v2 seems to be a paper update: https://twitter.com/scy994/status/1714499568573039102
Interesting. I'll see if I can find some time to at least try it and check whether it works.
Bad news: to implement this, it would be necessary to add operations for the Fourier transform and its inverse to ggml, and that is very difficult.
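For reference, here is a minimal sketch of what such a pair of operators would have to compute, written in plain Python rather than ggml. The names `dft_1d` and `dft_2d` are hypothetical, and this is the naive quadratic-time transform, not a fast FFT and not an existing ggml API. It follows the usual convention of no scaling on the forward transform and a 1/n factor on the inverse.

```python
import cmath


def dft_1d(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence (hypothetical helper)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # Only the inverse transform is scaled by 1/n.
        out.append(acc / n if inverse else acc)
    return out


def dft_2d(grid, inverse=False):
    """Separable 2-D DFT: transform every row, then every column."""
    rows = [dft_1d(row, inverse) for row in grid]
    cols = [dft_1d(list(col), inverse) for col in zip(*rows)]
    # cols holds the transposed result; transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    image = [[1.0, 2.0, 3.0, 4.0],
             [0.0, -1.0, 0.5, 2.0],
             [3.0, 3.0, 3.0, 3.0]]
    restored = dft_2d(dft_2d(image), inverse=True)
    # Largest round-trip error; should be at floating-point noise level.
    print(max(abs(r - o) for rr, ro in zip(restored, image) for r, o in zip(rr, ro)))
```

A real backend would need a fast algorithm and a tensor layout that fits ggml, which is where the difficulty lies; the sketch only pins down the arithmetic.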
I gave it a try, but I am not sure whether I implemented it correctly. Here are comparisons without and with FreeU.
| FreeU OFF | FreeU ON |
|---|---|
The ones above all use the artiusV21_v21.safetensors file. The comparison below switches to v2-1_768-nonema-pruned.safetensors.
| FreeU OFF | FreeU ON |
|---|---|
FreeU definitely won here 😸 .
> I gave a try but I am not sure if implemented correctly.
@bssrdf Do you mean in sd.cpp? I checked your fork, and I don't see anything related.
Yes. Please check out the add-freeu-support branch of my sd.cpp fork as well as the add-freeu-support branch of my ggml fork. Most of the work is done in ggml. Sorry, the code is a bit messy right now; I will clean it up before the PR.
@bssrdf Great job. To be merged into sd.cpp, this functionality must also work with the CPU backend. I don't know much about how to implement the FFT optimally, so I can't help you with that. I'm also interested to see whether you can replicate those results with sd-webui or with the implementation from the repository, to check whether they match or are at least similar. It would also help to determine whether the values passed to the Fourier_filter function there are the same as the ones your implementation receives.
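
    import torch
    import torch.fft as fft

    def Fourier_filter(x, threshold, scale):
        # FFT over the two spatial dimensions, with the zero frequency shifted to the center
        x_freq = fft.fftn(x, dim=(-2, -1))
        x_freq = fft.fftshift(x_freq, dim=(-2, -1))

        # Mask that scales only the central (low-frequency) box of the spectrum
        B, C, H, W = x_freq.shape
        mask = torch.ones((B, C, H, W), device=x.device)  # follows x instead of a hard-coded .cuda()
        crow, ccol = H // 2, W // 2
        mask[..., crow - threshold:crow + threshold, ccol - threshold:ccol + threshold] = scale
        x_freq = x_freq * mask

        # IFFT: undo the shift and keep the real part
        x_freq = fft.ifftshift(x_freq, dim=(-2, -1))
        x_filtered = fft.ifftn(x_freq, dim=(-2, -1)).real
        return x_filtered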
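
One observation that might make values easier to compare across implementations: because the shift only permutes the spectrum, multiplying the shifted spectrum by the centered mask and shifting back is the same as multiplying the unshifted spectrum by a shifted-back mask. The sketch below builds both masks in plain Python and shows the index rule. The helper names (`centered_mask`, `ifftshift_2d`, `unshifted_mask`) are hypothetical, and it assumes `threshold` is at most half of the smaller spatial dimension.

```python
def centered_mask(h, w, threshold, scale):
    """Mask as built in Fourier_filter: a box around the center of the shifted spectrum."""
    if not 0 <= threshold <= min(h, w) // 2:
        raise ValueError("sketch assumes 0 <= threshold <= min(h, w) // 2")
    mask = [[1.0] * w for _ in range(h)]
    crow, ccol = h // 2, w // 2
    for r in range(crow - threshold, crow + threshold):
        for c in range(ccol - threshold, ccol + threshold):
            mask[r][c] = scale
    return mask


def ifftshift_2d(grid):
    """Undo fftshift: the element at shifted position p returns to index (p - n // 2) % n."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[(r - h // 2) % h][(c - w // 2) % w] = grid[r][c]
    return out


def unshifted_mask(h, w, threshold, scale):
    """Same mask in the native FFT layout, so no shift is needed before multiplying."""
    if not 0 <= threshold <= min(h, w) // 2:
        raise ValueError("sketch assumes 0 <= threshold <= min(h, w) // 2")

    def is_low(k, n):
        # True when index k represents a signed frequency in [-threshold, threshold).
        return (k + threshold) % n < 2 * threshold

    return [[scale if is_low(r, h) and is_low(c, w) else 1.0 for c in range(w)]
            for r in range(h)]


if __name__ == "__main__":
    for row in unshifted_mask(4, 4, 1, 0.0):
        print(row)
```

Under that assumption, an implementation that never shifts the spectrum can still be checked against Fourier_filter by comparing the set of frequency bins it scales.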
Great job! Can you create a PR for both ggml and sd.cpp?
@FSSRepo, @leejet, thanks. There is still quite some work before I can make a PR.
- The changes to `sd.cpp` are relatively trivial but they do require some interface mods (I think it is better to make a switch for FreeU on or off).
- The `ggml` mods are more substantial, but still manageable. The problem is the two operators (FFT_FILTER and FREEU_BACKBONE) I added for FreeU are too specific and may not get accepted. I see there are some ggml extensions in sd.cpp but I don't see how cuda backends can be added. I am seeking your advice.
- The two additional operators are only implemented for GPU backend and I have to work on their CPU counterparts.
@FSSRepo, I can verify that the two operators are functioning correctly as I have test cases for both. But I don't think I can replicate results with sd-webui or with the implementation from the original author's repository.