diffusers
Analysis of Classifier-Free Guidance Weight Schedulers
Model/Pipeline/Scheduler description
The paper's authors analyze guidance weight schedulers and propose a one-line change that makes Classifier-Free Guidance results look better.
I ran some tests myself to confirm.
SD1.5 DDIM scheduler, 50 steps, "a photograph of an astronaut riding a horse", seed: 1024
guidance scale: 7.5
static
linear (proposed)
guidance scale: 14.0
static
linear (proposed)
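
For reference, here is a minimal sketch of what a linear guidance schedule could look like. It is not taken from the paper's code or from the runs above. It assumes the weight ramps linearly from 0 at the first (noisiest) step to twice the base scale at the last step, so that its average over all steps equals the static scale, and it assumes the convention where the scale multiplies `(text - uncond)`. The names `linear_guidance_scale` and `apply_cfg` are hypothetical; check the paper for the exact form and normalization.

```python
def linear_guidance_scale(step: int, num_steps: int, base_scale: float) -> float:
    """Guidance weight for a denoising step (0 = noisiest, num_steps - 1 = last).

    Assumption: a linear ramp from 0 up to 2 * base_scale, so the mean
    weight over all steps equals base_scale (the static setting).
    """
    if num_steps < 2:
        # With a single step there is nothing to schedule.
        return base_scale
    progress = step / (num_steps - 1)  # 0.0 at the first step, 1.0 at the last
    return 2.0 * base_scale * progress


def apply_cfg(noise_uncond, noise_text, scale):
    """Standard classifier-free guidance combination of the two predictions."""
    return noise_uncond + scale * (noise_text - noise_uncond)


if __name__ == "__main__":
    num_steps, base_scale = 50, 7.5
    scales = [linear_guidance_scale(i, num_steps, base_scale) for i in range(num_steps)]
    print("first:", scales[0], "last:", scales[-1], "mean:", sum(scales) / num_steps)
```

The "one-line change" then amounts to replacing the constant scale in the guidance step with the per-step value, e.g. `apply_cfg(noise_uncond, noise_text, linear_guidance_scale(i, num_steps, base_scale))`.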
Open source status
- [ ] The model implementation is available.
- [ ] The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
Paper: https://arxiv.org/abs/2404.13040
Update: SDXL result
guidance scale: 14.0
static
linear
cc: @yiyixuxu and @asomoza for visibility.
what a neat way to make use of the knowledge already in the model!
cc @asomoza can we make a callback for this?
Yes, but there are a lot of techniques for manipulating the CFG, most of them without papers. I added the cutoff one because I know it's really popular, it makes generation faster, and it serves as an example of how to manipulate the CFG.
Maybe we should let the community add these ones later on? Other more popular ones are automatic cfg and Dynamic Thresholding
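
In case it helps whoever picks this up, below is a rough sketch of how a linear schedule might be expressed as a `callback_on_step_end` function. It relies on the same pattern as the dynamic CFG example in the diffusers callback docs (setting `pipe._guidance_scale` inside the callback and reading `pipe.num_timesteps`). The factory name `make_linear_cfg_callback`, the 0 to 2x ramp, and the `min_scale` floor are assumptions for illustration, not an existing diffusers API. The floor is there because the Stable Diffusion pipelines only run the guided (doubled) batch while the scale is above 1, and dropping below that mid-run without also trimming `prompt_embeds` would likely cause a shape mismatch.

```python
from types import SimpleNamespace


def make_linear_cfg_callback(base_scale: float, min_scale: float = 1.01):
    """Build a step-end callback that ramps the guidance scale linearly.

    Hypothetical helper: ramps toward 2 * base_scale at the final step and
    never goes below `min_scale`, so classifier-free guidance stays enabled.
    """

    def callback(pipe, step_index, timestep, callback_kwargs):
        # The callback runs at the END of a step, so the value set here
        # is the one used by the NEXT denoising step.
        next_step = step_index + 1
        progress = min(next_step / max(pipe.num_timesteps - 1, 1), 1.0)
        pipe._guidance_scale = max(min_scale, 2.0 * base_scale * progress)
        return callback_kwargs

    return callback


# Stand-in object so the sketch runs without loading a model.
fake_pipe = SimpleNamespace(num_timesteps=50, _guidance_scale=7.5)
linear_cfg = make_linear_cfg_callback(base_scale=7.5)
linear_cfg(fake_pipe, 0, None, {})
print("scale for step 1:", fake_pipe._guidance_scale)

# With a real pipeline, usage would presumably look like:
# image = pipe(prompt, guidance_scale=1.01, callback_on_step_end=linear_cfg).images[0]
```

Note that the very first step still uses whatever `guidance_scale` is passed to the pipeline call, since the callback only takes effect from the second step onward.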
Hi, I'm interested in your work. Is there an explanation of WHY this phenomenon happens?
i'm not an expert on this, but at a cursory glance it seems to base the strength of guidance on the position in the timestep schedule. this also likely works because there are two types of attention used by the model, with earlier timesteps dominated by cross-attn (heavily relying on the text conditional input) and later timesteps dominated by self-attn (practically ignoring the prompt)