stable-diffusion-webui
Low weights such as (word:0.1) don't do what you'd expect
Either the documentation of numeric weights is lacking or the weights themselves are handled oddly: do they have some sort of minimum influence, and if so, how does it work?
Lowering weights from 1.0 initially does what you'd expect, but at ~0.6 and below it seems to diminish the effect of the keyword less than expected. You'd also expect a weight of 0 to remove all influence of the keyword, yet that's not the case: a keyword with a weight of 0 still clearly has a lot of influence on the prompt. That feels wrong and is inconvenient when trying to control how subtle the influence of a keyword is, let alone when trying to create smooth transitions between not having a keyword and having its full effect. It's also unclear what negative weights do. For instance, I was only able to bring back a person's arms by using (arms:-0.8); using (arms:0.8) wasn't the same, yet (arms:-2) produces as much gibberish as (arms:2).
Speaking of weights of 2, I also wish I knew why things go so crazy when weights reach around 2. Clearly there must be some formula applied to weights, something like tan(), that makes a weight of 2 have an insane image-breaking influence. It would be nice to document what's going on there, as it's not at all what one would intuitively expect a weight of 2 to do.
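To make the "what formula is applied" question concrete, here is a toy sketch under one assumption: that a weight is a plain linear multiplier on the keyword's vector, with no tan()-like curve involved. Nothing in this thread confirms that this is what the webui does; scale_token, norm and keyword_vec are made-up names for illustration only.

```python
import math


def scale_token(vec, weight):
    # ASSUMPTION (not confirmed in this thread): the weight is a plain
    # linear multiplier applied to the keyword's vector.
    return [weight * x for x in vec]


def norm(vec):
    # Euclidean length of the vector.
    return math.sqrt(sum(x * x for x in vec))


# Made-up 2-D stand-in for a keyword's vector; its length is 5.
keyword_vec = [3.0, 4.0]

if __name__ == "__main__":
    for w in (1.0, 0.6, 0.0, 2.0, -2.0):
        print(w, norm(scale_token(keyword_vec, w)))
```

If that assumption holds, a weight of 2 and a weight of -2 push the vector equally far from its original length, which would at least be consistent with (arms:-2) and (arms:2) both producing gibberish.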
I think it's just a matter of how much a given subject is overfit in the model weights, so try another subject or repeat the weighting command. With some animals I have to repeat their name multiple times to get things to work, and I expect this is the same case. Not all things are equally well trained; that would be quite an accomplishment.
That's not really what I'm asking about; I'm asking about numerical weights.
You can use the prompt editing/scheduler syntax [from:to:when].
For something major that needs to take shape, use from with a low when. For example, [dress::0.2]. Something that's already in the picture will likely stay in the picture, and if it keeps being applied, it won't be editable later. Hence the low when.
For something minor that edits or alters something, use to. For example, [:clothes removed:0.3]. Some prompts have a strong effect, others a weak one, so set when low (early) or high (late) accordingly.
You can also combine the two to put something in the middle of rendering. For example, [[:sobbing:0.3]::0.5] applies sobbing from 30% to 50% of the steps. I found this necessary in some cases. When applied near the end, the tears don't become a separate object and instead alter the eye shape. Meanwhile, if it's applied at the beginning, it affects the picture a lot but may hardly take shape, adding some obscure skin- or eye-colored bulge instead.
If it's not strong enough, you can extend the application steps through when, or you can also add weights to it with (prompt: weight).
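As a rough illustration of the timing described above, here is a minimal sketch of how a [from:to:when] block could be resolved at a given step. It is not the webui's parser: resolve and sobbing_window are hypothetical helpers, when is assumed to be a fraction of the total steps, and the exact rounding at the switch point is a guess.

```python
def resolve(from_text, to_text, when, step, total_steps):
    """Return the text active at `step` for a [from:to:when] block.

    ASSUMPTIONS: `when` is a fraction of `total_steps`, and the switch
    happens once `step` reaches that boundary. The rounding is a guess.
    """
    switch_step = round(when * total_steps)
    return from_text if step < switch_step else to_text


def sobbing_window(step, total_steps):
    # [[:sobbing:0.3]::0.5] resolved from the inside out.
    inner = resolve("", "sobbing", 0.3, step, total_steps)  # added at 30%
    return resolve(inner, "", 0.5, step, total_steps)       # removed at 50%


if __name__ == "__main__":
    total = 20
    for step in (3, 8, 15):
        print(step, repr(sobbing_window(step, total)))
```

With 20 steps, the keyword is absent at step 3, present at step 8, and absent again at step 15, which matches the 30% to 50% window.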
I find it odd that this wouldn't be considered a bug given that the desirable behaviour is for weights of 0 to have the same effect as not having the keyword at all. As it is there's no way to smoothly add a keyword.
This is not easy to support, because merely including the word affects the output. Having zero attention (weight) for a token means that the model does not consider it important for a specific image feature. However, the token being present still affects the overall structure and context of the image. Zero attention and the token being absent are simply not the same. There is no way to do this without invasive modifications to the CLIP and Stable Diffusion model architectures.
Feel free to reopen this if you want, but I doubt you will see any more responses, as this is a very old thread and what you're asking for is not really possible.
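A toy sketch of that distinction, under loud assumptions: toy_encode is a made-up stand-in for a contextual text encoder (it is not CLIP), in which every output row mixes a token's own vector with the mean of all tokens in the prompt, and apply_weights simply multiplies each row by its weight. Both are hypothetical and only illustrate why a present token at weight 0 differs from an absent one.

```python
def toy_encode(tokens):
    """Toy stand-in for a contextual text encoder (hypothetical, NOT CLIP).

    Each output row is the token's own made-up vector plus the mean of all
    token vectors in the prompt, so every row depends on the whole prompt.
    """
    own = [[float(t), float(t * t)] for t in tokens]
    n = len(own)
    context = [sum(v[i] for v in own) / n for i in range(2)]
    return [[v[i] + context[i] for i in range(2)] for v in own]


def apply_weights(encoded, weights):
    # ASSUMPTION: a weight multiplies the token's own output row only.
    return [[w * x for x in row] for row, w in zip(encoded, weights)]


# Prompt with a third token at weight 0, versus a prompt without that token.
with_zeroed = apply_weights(toy_encode([1, 2, 3]), [1.0, 1.0, 0.0])
without = toy_encode([1, 2])

zeroed_row_is_zero = all(x == 0.0 for x in with_zeroed[2])
others_match_absent = with_zeroed[:2] == without

if __name__ == "__main__":
    print(zeroed_row_is_zero, others_match_absent)
```

In this toy, the zero-weighted token's own row is wiped out, but the rows for the other tokens still differ from the prompt that never contained it, because the token already shaped their context before the weight was applied.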