
[Feature Request]: Add an option to use old implementation of UniPC/DDIM

Open 20of opened this issue 1 year ago • 22 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Adds an option to settings/samplers to use old implementation of CFG denoiser

Proposed workflow

  1. Go to settings
  2. Press samplers
  3. Enable old implementation of CFG denoiser for DDIM, PLMS and UniPC

Additional information

Requesting this because the new update broke all my previous gens and made these samplers unusable (at least for me); I haven't managed to get even close to the previous detail/quality. The old implementation (1.5.2) allowed very high CFG values for DDIM and UniPC while maintaining good quality. As of 1.6, all samplers seem to deepfry images at higher CFG values. And no, using low CFG values did not fix this, nor did anything else I tried.

20of avatar Sep 04 '23 22:09 20of

I did a quick test with DDIM in 1.6.0 and 1.5.2

1.5.2 1.6.0
20230905-084547-869888-3486974419 20230905-083813-461964-3486974419

w-e-w avatar Sep 04 '23 23:09 w-e-w

I had the same issue, especially when using img2img (edit: inpainting). I had to revert to 1.5.2, as the DDIM sampler is unusable for me now.

gshawn3 avatar Sep 07 '23 06:09 gshawn3

I third this issue. Thank you for posting; hope we can figure it out and get it implemented!

nostepme avatar Sep 07 '23 06:09 nostepme

In my tests the behavior is identical. Please take the time to show examples with embedded metadata, like w-e-w did, of the previous and current behavior if you truly believe it's not functional. Nothing can be done on our part if we can't reproduce the issue you're facing.

catboxanon avatar Sep 07 '23 15:09 catboxanon

can you try 10-15 steps, <9 CFG and maybe a doubly complicated high token prompt w/ more than two Loras ?

nostepme avatar Sep 07 '23 20:09 nostepme

@20of @gshawn3 @nostepme you wouldn't happen to be using Seshelle's CFG Rescale extension? It doesn't work with the updated DDIM and UniPC implementations and it sounds like you're seeing the unscaled outputs. Other samplers work fine though. image

feffy380 avatar Sep 10 '23 09:09 feffy380

Hi @feffy380 @catboxanon @w-e-w @20of ;

Sorry for late reply! had to get both installed and do an sfw prompt lol

Anyway, using: xformers; no token merge; padded prompt - yes; persistent cond - yes; batch cond/uncond (1.6.0) - yes; <---tried unchecking this since I didn't see in 1.5.2 but same result

These are the only things different from the default Auto1111 settings, at least in 1.5.2. I think it's default in 1.6.0. No extensions or scripts used; 1080 Ti (tried old drivers, which did not change the result); Python 3.10.11. I have always used this setup with no problem and could reproduce old seeds etc. until this update.

For me, it's generally less detail, color and prompt adherence in 1.6.0 vs 1.5.2;

Here is example just using ( Juggernaut - Final ) and the cat prompt and seed from ( Juggernaut - Aftermath ) model page - included generation info of course and yes same VAE:

[ 1.5.2 ]

152-3115824003

[ 1.6.0 ]

160-3115824003

Let me know if you'd like me to try a specific model or look out for other specific settings/changes etc.

v/r

nostepme avatar Sep 23 '23 20:09 nostepme

@20of @gshawn3 @souki202 can you post example?

nostepme avatar Sep 24 '23 02:09 nostepme

I am not sure if this is the same phenomenon, but the same sampler is not able to remove noise properly with SDXL. It is normal with SD1.5.

RTX 3090, 16 steps, CFG 7.0, xformers, ToMe 0.2, Negative Guidance minimum sigma: 1. Sampler parameters are at the default settings. CFG Rescale extension not used.

xyz_grid-0001-1

It is easy to see in the blue background, but DDIM and UniPC leave a hazy noise in the image if you look closely.

40 steps, cfg 7.0

xyz_grid-0002-1

souki202 avatar Sep 25 '23 23:09 souki202

@souki202

Hmm, interesting and still could be related. Maybe it's only discernable to the naked eye in SDXL since higher res. If you can, would you mind trying one of your new seeds with version 1.5.2 and as identical settings as possible?

With my old DDIM seeds, the new implementation is 'hazy', to your point, in that the final result has less detail/color/contrast. This could definitely be related to denoising efficiency or scale or something, because the 'haze' you mention manifests for me as discernibly less contrast.

Seems UniPC here has same if not more poppy color/contrast than DPM++ but less coherent?

nostepme avatar Sep 26 '23 21:09 nostepme

@souki202

also, i know you won't be able to test your sdxl seeds against this in 1.5.2 but it would still be helpful if you're able

unfortunately it seems models other than sdxl are the only way we can version test and compare. i'm a bit too novice to determine what actually changed in the implementation and also maybe how it translates to comfy nodes but i'll try.

nostepme avatar Sep 26 '23 21:09 nostepme

@nostepme

UniPC has funny noise in the image above, as does DDIM. As you say, unfortunately DDIM, PLMS, and UniPC are not available for SDXL in version 1.5.2 of the UI, so no comparison is possible. I am not a native English speaker, so it is difficult for me to describe in words the noise-like color irregularities that remain in the image, but I suspect that the noise was not removed successfully and is left behind partway through denoising.

For reference, I will place images generated with v1.5.2 and 1.6 using SD1.x models. There is a slight difference in the images, but it is only a minor difference in settings due to the difference in versions. In the case of the sd1.x model, there is almost no difference, so I think there must be a bug in the part adjusted specifically for SDXL.

v1.5.2 DDIM, cfg7.0 v1 52xyz_grid-0004-9

v1.6 v1 6xyz_grid-0001-9

souki202 avatar Sep 27 '23 03:09 souki202

@souki202

Hey thanks for posting, and yes I noticed the funny noise/detail difference in UniPC as well. Although, seems like it strangely has more poppy color and contrast/dynamic range otherwise, even compared to DPM++?

Anyway, here is difference for last one you posted. (6.52%)

6 52 percent change

ONLYDIFF
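
For anyone who wants to put a number on a comparison like this, here is a minimal sketch of one possible metric. How the 6.52% above was actually computed is not stated in the thread, so the definition used here (share of pixels where any channel differs) is an assumption, and `percent_pixels_changed` is a hypothetical helper. Images are plain nested lists of RGB tuples to keep the sketch dependency-free.

```python
def percent_pixels_changed(img_a, img_b, tolerance=0):
    """Percentage of pixels where any channel differs by more than `tolerance`.

    ASSUMPTION: this is only one plausible way to get a "percent change";
    the thread does not say which metric was used.
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")

    total = 0
    changed = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            # A pixel counts as changed if at least one channel moved.
            if any(abs(ca - cb) > tolerance for ca, cb in zip(px_a, px_b)):
                changed += 1
    return 100.0 * changed / total if total else 0.0


# Two tiny 2x2 "images" that differ in exactly one pixel.
old_render = [[(10, 10, 10), (20, 20, 20)], [(30, 30, 30), (40, 40, 40)]]
new_render = [[(10, 10, 10), (20, 20, 20)], [(30, 30, 30), (40, 41, 40)]]
result = percent_pixels_changed(old_render, new_render)
```

A small `tolerance` can be used to ignore differences that come from nondeterministic GPU arithmetic rather than from a change in the sampler.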

nostepme avatar Sep 27 '23 16:09 nostepme

@nostepme

Hmm, I don't see any significant difference between the two. In 1.6, there is an additional setting in Optimizations, so there may be a difference as well.

UniPC and Restart are samplers that tend to be saturated and contrasty in nature. DDIM is a bit more subdued in saturation than DPM++.

souki202 avatar Sep 28 '23 00:09 souki202

@souki202

Yea I did try unchecking the "batch cond/uncond (1.6.0)" in my example above but it made no visible difference. I'll have to do this test on that as well.

But my point is that there is a difference. At higher resolution/steps it may not result in something visible to the naked eye (SD1.5), but it is still a difference nonetheless. With iterative results/processes or lower steps/CFG etc., this minute difference can compound into a drastic change in the result, or maybe just noise in the case of SDXL.

Now just need to figure out if it is from:

nondeterministic results; settings; change/error in the new DDIM/etc implementation

Also, @w-e-w (second post) actually got no technical difference in result...I wonder what graphics card?

nostepme avatar Sep 28 '23 12:09 nostepme

Oops, totally forgot about this, sorry. I'm positive that the problem lies with the combination of vpred models, Seshelle's CFG Rescale and the changed denoising implementation.

Here are examples, you can probs guess which is which, but they are identical in png info's eyes. 00210-3048865886 00002-3048865886

20of avatar Sep 29 '23 03:09 20of

@20of @souki202 @feffy380 @catboxanon

I'm having to look up what a vpred model is, but it sounds legitimate; not sure if Juggernaut falls in this category, but it is affected by whatever this is as well, according to my example post.

Also, I wonder if it has something to do with scheduler?

I noticed in comfyUI there is a 'DDIM_uniform' scheduler; as opposed to normal, simple, exponential, karras, and sgm_uniform...

and since they are automatically selected in combination with the ksamplers in auto1111 I wonder...

nostepme avatar Sep 29 '23 16:09 nostepme

watch this just be a cpu/gpu noise thing along with maybe a mix of the cfg scaling/scheduler.

eh but looks like it's missed its window of exposure

goodbye old DDIM and seeds, guess ill move on (T___T) maybe it's a good thing

v/r

nostepme avatar Oct 03 '23 02:10 nostepme

The difference is easily detectable when using DDIM for img2img.

The new DDIM produces results not much different from other samplers, while the old DDIM produces markedly different results, which is not surprising.

Adding it back alongside the new implementation is trivial; I have added it back for myself by reverting some code in a couple of files. The old implementation was simply removed, and it is a simple matter of adding back the old code to restore it.

ec111 avatar Oct 03 '23 10:10 ec111

Could you make a pull request?

ArnoldDCoy avatar Oct 14 '23 10:10 ArnoldDCoy

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/13643

New implementation of DDIM required a workaround to be compatible with non-updated extensions such as TiledDiffusion.

However, that code interfered with the old samplers so I removed it.

As a result, the new implementations won't run with my code, but old ones (the ones I restored) will.

ec111 avatar Oct 15 '23 00:10 ec111

After some research I found out that the noticeable changes happen when cond and uncond have different lengths (as in when prompt is <=75 tokens and neg prompt is >75 tokens). There is a weird kind of padding I did to make DDIM code work with this, and when using 1.6.0's implementation, there is no longer need for it, so things function the same way they do for kdiffusion samplers.
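
To make the length mismatch concrete, here is a minimal sketch of that kind of padding. The exact rule is an assumption on my part (the comment above only calls it "a weird kind of padding"): the shorter uncond is extended by repeating its last vector, and a longer uncond is truncated to the length of cond. `pad_uncond_like_old_ddim` is a hypothetical name, and plain lists stand in for the real embedding tensors.

```python
CHUNK = 77  # 75 prompt tokens plus start/end tokens per CLIP chunk


def pad_uncond_like_old_ddim(cond, uncond):
    """Make uncond the same length as cond.

    ASSUMPTION (not confirmed by the thread): a shorter uncond is extended
    by repeating its last vector; a longer uncond is cut to len(cond).
    """
    if not uncond:
        raise ValueError("uncond must contain at least one vector")
    missing = len(cond) - len(uncond)
    if missing > 0:
        # Repeat a copy of the last vector for every missing position.
        return uncond + [list(uncond[-1]) for _ in range(missing)]
    # Equal length returns uncond unchanged; longer uncond loses its tail.
    return uncond[: len(cond)]


# Prompt <= 75 tokens gives one chunk; negative prompt > 75 tokens gives two.
cond = [[float(i)] for i in range(CHUNK)]
uncond = [[float(-i)] for i in range(2 * CHUNK)]
padded = pad_uncond_like_old_ddim(cond, uncond)
```

Under this assumed rule, a long negative prompt is effectively cut off after the first chunk, which would explain why old and new outputs only diverge when the two prompts differ in length.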

Now, some people, it seems, found this kind of padding desirable. So my solution is to add an option to use this padding with any sampler. In dev branch, you can find it in Settings -> Stable Diffusion -> Optimizations -> Pad prompt/negative prompt (v0). It's also automatically enabled when reading infotexts from pics with specified version below 1.6.0 and DDIM/PLMS sampler.

With this, you can reproduce the robot cat picture by just reading infotext from it (although you'll need Pad prompt/negative prompt optimization disabled because it overrides Pad prompt/negative prompt (v0)).

firefox_lKp7l2xnbF
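
The auto-enable rule described above can be sketched roughly as follows. This is an illustration only, not the webui code: `wants_v0_padding` and `parse_version` are hypothetical helpers, the `Version` and `Sampler` keys are assumed to be how the parsed infotext is exposed, and leaving the option off when no version is recorded is my assumption.

```python
import re

OLD_PADDING_SAMPLERS = {"DDIM", "PLMS"}
FIRST_NEW_VERSION = (1, 6, 0)


def parse_version(text):
    """Turn 'v1.5.2' or '1.5.2' into (1, 5, 2); return None if unparseable."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    return tuple(int(part) for part in match.groups()) if match else None


def wants_v0_padding(params):
    """True when the infotext is from before 1.6.0 and used DDIM or PLMS.

    ASSUMPTION: a missing or unparseable version leaves the option off.
    """
    version = parse_version(params.get("Version", ""))
    if version is None:
        return False
    return version < FIRST_NEW_VERSION and params.get("Sampler") in OLD_PADDING_SAMPLERS


old_ddim = wants_v0_padding({"Version": "v1.5.2", "Sampler": "DDIM"})
new_ddim = wants_v0_padding({"Version": "v1.6.0", "Sampler": "DDIM"})
old_euler = wants_v0_padding({"Version": "v1.5.2", "Sampler": "Euler a"})
```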

AUTOMATIC1111 avatar Jan 27 '24 19:01 AUTOMATIC1111

So cool, I'm honored you sorted and implemented this; maybe it's a nice buffer for an overkill/disparate cond/uncond prompter and of course helps to regenerate pre-1.6.0 gens. Thank you again, looking forward to testing here in a bit

<3

@20of @souki202 @feffy380 @gshawn3 @ec111

nostepme avatar Jan 30 '24 18:01 nostepme