[Bug]: Zero denoising fails with ZeroDivisionError for img2img

Open derek-upham opened this issue 2 years ago • 11 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

The VanillaStableDiffusionSampler has two calls to its adjust_steps_if_invalid method.

In sample_img2img (line 253):

        steps = self.adjust_steps_if_invalid(p, steps)

In sample (line 274):

        steps = self.adjust_steps_if_invalid(p, steps or p.steps)

Notice that the sample call uses steps if that is truthy, or p.steps otherwise. The sample_img2img call does not have that fallback expression.
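For reference, a minimal sketch of how that `steps or p.steps` fallback behaves. The `p` object and the value 20 are hypothetical stand-ins, not the web UI's real processing object. Note that `or` falls back on any falsy value, so it replaces an explicit 0 as well as None.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the processing object; only `steps` matters here.
p = SimpleNamespace(steps=20)


def effective_steps(p, steps=None):
    """Return `steps` if it is truthy, otherwise fall back to `p.steps`."""
    return steps or p.steps


print(effective_steps(p))      # steps is None, so this falls back to 20
print(effective_steps(p, 35))  # an explicit truthy value is kept: 35
print(effective_steps(p, 0))   # 0 is falsy too, so this also gives 20
```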

The calling code does not pass an explicit steps parameter; the method declaration instead uses the default None value. That means the code appears to pass the None value on to adjust_steps_if_invalid, which I suspect is what triggers the ZeroDivisionError.

(The KDiffusionSampler class does not have adjust_steps_if_invalid at all, and avoids the bug.)

Steps to reproduce the problem

Send any image through img2img using the DDIM sampler and look for the stack trace in the console output.

What should have happened?

The sample_img2img code should probably use the same protective fallback expression as sample.

Commit where the problem happens

86359535d6fb0899fa9e838d27f2006b929331d5

What platforms do you use to access UI ?

Linux

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

No response

Additional information, context and logs

No response

Jan 17 '23 03:01 derek-upham

I can't reproduce your issues, DDIM runs just fine in img2img for me, no stack trace error etc. I also can't find any files you mentioned. Are we talking the standard sd webui here?

Jan 17 '23 10:01 mclsugi

Standard webui, with --api, --listen, --server-name options.

The processing.py module has two places that call sample_img2img. In the StableDiffusionProcessingTxt2Img class, sample method, line 866:

        samples = self.sampler.sample_img2img(self, samples, noise, conditioning, unconditional_conditioning, steps=self.hr_second_pass_steps or self.steps, image_conditioning=image_conditioning)

In the StableDiffusionProcessingImg2Img class, sample method, line 1011:

        samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)

That second one lacks the explicit steps keyword parameter.

In the samplers themselves, the sample_img2img method is consistent in the method signature (sd_samplers.py, lines 253 and 492 for the two classes):

    def sample_img2img(self, p, x, noise, conditioning, unconditional_conditioning, steps=None, image_conditioning=None):

That means the StableDiffusionProcessingImg2Img code path is legal, and will end up passing in None. Then my local patch up at line 253 provides the fallback.

     def sample_img2img(self, p, x, noise, conditioning, unconditional_conditioning, steps=None, image_conditioning=None):
         steps, t_enc = setup_img2img_steps(p, steps)
-        steps = self.adjust_steps_if_invalid(p, steps)
+        steps = self.adjust_steps_if_invalid(p, steps or p.steps)
         self.initialize(p)

Without that, adjust_steps_if_invalid fails:

    def adjust_steps_if_invalid(self, p, num_steps):
        if (self.config.name == 'DDIM' and p.ddim_discretize == 'uniform') or (self.config.name == 'PLMS'):
            valid_step = 999 / (1000 // num_steps)
            if valid_step == floor(valid_step):
                return int(valid_step) + 1

        return num_steps

Maybe the difference in our test environments is that you're running with quad for the discretization?

The other possibility is in setup_img2img_steps.

def setup_img2img_steps(p, steps=None):
    if opts.img2img_fix_steps or steps is not None:
        requested_steps = (steps or p.steps)
        steps = int(requested_steps / min(p.denoising_strength, 0.999)) if p.denoising_strength > 0 else 0
        t_enc = requested_steps - 1
    else:
        steps = p.steps
        t_enc = int(min(p.denoising_strength, 0.999) * steps)

    return steps, t_enc

The "else" block should have the same outcome as my patch, and use p.steps. That suggests that the code path is going through the "true" block and this is some sort of math error. In which case the different behavior is related to the "fix steps" option. (I'm guessing that the option up there is the "do exactly the amount of steps the slider specifies" checkbox in the settings.)

I'll back out my local patch and add some trace prints, then try the different discretization and "fix steps" values.

Jan 17 '23 15:01 derek-upham

The ZeroDivisionError, from before any local patches, was this:

  File "/extra/ArtGenerators/stable-diffusion-webui/modules/processing.py", line 1011, in sample
    samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/sd_samplers.py", line 255, in sample_img2img
    steps = self.adjust_steps_if_invalid(p, steps)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/sd_samplers.py", line 247, in adjust_steps_if_invalid
    valid_step = 999 / (1000 // num_steps)
ZeroDivisionError: integer division or modulo by zero

The "integer division" in that message means the failing operation is the //, so the problem is the num_steps value passed to it.
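A quick check of that reading, using only plain Python: floor-dividing by None raises TypeError, while floor-dividing by integer zero raises the ZeroDivisionError seen in the traceback. So the value reaching `//` would have to be a real 0 rather than None. The helper name `classify` is made up for illustration.

```python
def classify(num_steps):
    """Report which exception, if any, the expression from the traceback raises."""
    try:
        999 / (1000 // num_steps)
    except ZeroDivisionError:
        return "ZeroDivisionError"
    except TypeError:
        return "TypeError"
    return "ok"


print(classify(None))  # TypeError: None is not silently converted to 0
print(classify(0))     # ZeroDivisionError, raised by the // operator
print(classify(20))    # ok
```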

Jan 17 '23 15:01 derek-upham

And now I can't reproduce it either when I remove my patch, with any combination of the two test parameters. Just...lovely. Okay, I'll close the issue, and if I do trigger it again I'll trace the heck out of the call path and reopen.

Jan 17 '23 16:01 derek-upham

ZeroDivisionError: integer division or modulo by zero. One of my favorite methods of destroying something 25 years ago. [:nostalgia]

Maybe the difference in our test environments is that you're running with quad for the discretization?

Try both. DDIM is my favorite sampler besides DPM++ 2M, so I checked what was up. Thanks for the rundown, I've learned a lot.

Jan 17 '23 16:01 mclsugi

Success. And by "success", I mean "failure".

Traceback (most recent call last):
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/call_queue.py", line 45, in f
    res = list(func(*args, **kwargs))
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/call_queue.py", line 28, in f
    res = func(*args, **kwargs)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/img2img.py", line 146, in img2img
    processed = modules.scripts.scripts_img2img.run(p, *args)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/scripts.py", line 337, in run
    processed = script.run(p, *script_args)
  File "/extra/ArtGenerators/stable-diffusion-webui/scripts/xy_grid.py", line 435, in run
    processed = draw_xy_grid(
  File "/extra/ArtGenerators/stable-diffusion-webui/scripts/xy_grid.py", line 227, in draw_xy_grid
    processed:Processed = cell(x, y)
  File "/extra/ArtGenerators/stable-diffusion-webui/scripts/xy_grid.py", line 413, in cell
    res = process_images(pc)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/processing.py", line 479, in process_images
    res = process_images_inner(p)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/processing.py", line 608, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/processing.py", line 1011, in sample
    samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/sd_samplers.py", line 255, in sample_img2img
    steps = self.adjust_steps_if_invalid(p, steps)
  File "/extra/ArtGenerators/stable-diffusion-webui/modules/sd_samplers.py", line 247, in adjust_steps_if_invalid
    valid_step = 999 / (1000 // num_steps)
ZeroDivisionError: integer division or modulo by zero

I'll start adding tracing.

Jan 17 '23 16:01 derek-upham

I ended up putting this PDB invocation into the wrap_gradio_call method, and confirmed that it enters the debugger (with a trial 1 // 0 expression in the try block). The next time it turns up, I should be able to walk the stack.

        try:
            res = list(func(*args, **kwargs))
        except Exception as e:
            if e.__class__ == ZeroDivisionError:
                import pdb; pdb.post_mortem()

            # When printing out our debug argument list, do not print out more than a MB of text
            max_debug_str_len = 131072 # (1024*1024)/8
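The same idea can be sketched as a standalone decorator, which may be handy outside wrap_gradio_call. The names `debug_on` and `hook` are hypothetical; by default the hook is `pdb.post_mortem`, and the demo swaps in a list so it runs without opening a debugger.

```python
import functools
import pdb


def debug_on(exc_type, hook=pdb.post_mortem):
    """Decorator: call `hook(traceback)` when `exc_type` escapes, then re-raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exc_type as exc:
                # The traceback still holds every frame down to the failing line.
                hook(exc.__traceback__)
                raise
        return wrapper
    return decorator


seen = []


@debug_on(ZeroDivisionError, hook=seen.append)  # record instead of opening pdb
def trial():
    return 1 // 0


try:
    trial()
except ZeroDivisionError:
    print("hook calls:", len(seen))  # hook calls: 1
```

Dropping the `hook=` argument gives the interactive behavior: the debugger opens at the frame that raised, and the exception still propagates afterwards.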

Jan 18 '23 03:01 derek-upham

Here we go.

def setup_img2img_steps(p, steps=None):
    if opts.img2img_fix_steps or steps is not None:
        requested_steps = (steps or p.steps)
        steps = int(requested_steps / min(p.denoising_strength, 0.999)) if p.denoising_strength > 0 else 0
        t_enc = requested_steps - 1
    else:
        steps = p.steps
        t_enc = int(min(p.denoising_strength, 0.999) * steps)

    return steps, t_enc

Let's say that we asked for a 0.0 denoising strength. (Because we want to see the denoising progression in an X/Y plot.) And let's say that we have enabled the "reduce number of steps based on denoising level" feature. In that case p.denoising_strength is zero and this method returns zero steps (thanks to that trailing else value).

  File "/extra/ArtGenerators/stable-diffusion-webui/modules/sd_samplers.py", line 247, in adjust_steps_if_invalid
    valid_step = 999 / (1000 // num_steps)
ZeroDivisionError: integer division or modulo by zero

Divide-by-zero in the very next method.
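The path described above can be reproduced in isolation. This is a sketch under stated assumptions: the `fix_steps` parameter stands in for opts.img2img_fix_steps, `p` is a hypothetical stand-in for the processing object, and adjust_steps_if_invalid is reduced to the branch taken for PLMS or DDIM with "uniform" discretization.

```python
from math import floor
from types import SimpleNamespace


def setup_img2img_steps(p, steps=None, fix_steps=True):
    # `fix_steps` replaces opts.img2img_fix_steps (an assumption of this sketch).
    if fix_steps or steps is not None:
        requested_steps = steps or p.steps
        steps = int(requested_steps / min(p.denoising_strength, 0.999)) if p.denoising_strength > 0 else 0
        t_enc = requested_steps - 1
    else:
        steps = p.steps
        t_enc = int(min(p.denoising_strength, 0.999) * steps)
    return steps, t_enc


def adjust_steps_if_invalid(num_steps):
    # Only the PLMS / DDIM-uniform branch of the original method.
    valid_step = 999 / (1000 // num_steps)
    if valid_step == floor(valid_step):
        return int(valid_step) + 1
    return num_steps


# Hypothetical processing object with zero denoising strength.
p = SimpleNamespace(steps=20, denoising_strength=0.0)

steps, t_enc = setup_img2img_steps(p)
print(steps, t_enc)  # 0 19

try:
    adjust_steps_if_invalid(steps)
except ZeroDivisionError as exc:
    print("ZeroDivisionError:", exc)
```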

Jan 20 '23 06:01 derek-upham

I tweaked the bug title to reflect the problem.

Jan 20 '23 06:01 derek-upham

The bug is still only relevant for PLMS, or for DDIM with "uniform" discretization.

    def adjust_steps_if_invalid(self, p, num_steps):
        if (self.config.name == 'DDIM' and p.ddim_discretize == 'uniform') or (self.config.name == 'PLMS'):
            valid_step = 999 / (1000 // num_steps)
            if valid_step == floor(valid_step):
                return int(valid_step) + 1
        
        return num_steps

I think we should just skip the whole special block when num_steps is 0, and return 0. Even DDIM is happy with 0 steps, with "quad" discretization.

Note that a zero result is essentially what the calculation would produce if we were to move num_steps into the dividend: valid_step = 999 * num_steps / 1000. (Yes, I know that doing so would lose the integer division.)
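A minimal sketch of the "skip the block when num_steps is 0" idea, written as a standalone function. The `sampler_name` and `ddim_discretize` parameters replace self.config.name and p.ddim_discretize; that signature is an assumption made for illustration, not the project's actual fix.

```python
from math import floor


def adjust_steps_if_invalid(sampler_name, ddim_discretize, num_steps):
    """Standalone variant of the method with a guard for zero steps."""
    if num_steps == 0:
        # Nothing to adjust, and 1000 // 0 below would raise.
        return 0
    if (sampler_name == 'DDIM' and ddim_discretize == 'uniform') or sampler_name == 'PLMS':
        valid_step = 999 / (1000 // num_steps)
        if valid_step == floor(valid_step):
            return int(valid_step) + 1
    return num_steps


print(adjust_steps_if_invalid('PLMS', 'uniform', 0))   # 0
print(adjust_steps_if_invalid('PLMS', 'uniform', 20))  # 20
print(adjust_steps_if_invalid('PLMS', 'uniform', 27))  # 28
print(adjust_steps_if_invalid('DDIM', 'quad', 27))     # 27
```

Whether the rest of the sampler copes with zero steps is a separate question that this guard does not answer.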

Jan 20 '23 06:01 derek-upham

I was running into the same bug when calling the API using enable_hr (which really should be hr_enabled so that it follows the logical pattern of the other hr_ parameters) with the PLMS sampler.

Skipping the block or changing the valid_step calculation didn't work for me, as it just pushed the bug further down the execution chain -- the bug was a symptom of another issue.

I tracked down the bug to setup_img2img_steps in sd_samplers_common, specifically min(p.denoising_strength, 0.999) -- I wasn't explicitly passing a denoising strength in my request. When I did, it started working. Seems like there's a missing default value somewhere earlier in the request.

Here's my request body which does work:

    {
      "prompt": "Testing 123",
      "sampler_name": "PLMS",
      "n_iter": 1,
      "steps": 35,
      "cfg_scale": 9.5,
      "width": 640,
      "height": 640,
      "enable_hr": true,
      "hr_upscaler": "Latent",
      "hr_second_pass_steps": 30,
      "hr_resize_x": 1280,
      "hr_resize_y": 1280,
      "denoising_strength": 0.7
    }
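A small sketch of the workaround described above: always sending an explicit denoising_strength. The helper name `build_payload` is hypothetical, the fallback value 0.7 is simply the one from the request above, and the body is only serialized here, not sent anywhere.

```python
import json


def build_payload(prompt, denoising_strength=0.7, **overrides):
    """Build a request body that always carries an explicit denoising_strength."""
    payload = {
        "prompt": prompt,
        "sampler_name": "PLMS",
        "steps": 35,
        "enable_hr": True,
        "hr_upscaler": "Latent",
        "hr_second_pass_steps": 30,
        "denoising_strength": denoising_strength,
    }
    payload.update(overrides)
    if payload.get("denoising_strength") is None:
        # Refuse to build a request that relies on a server-side default.
        raise ValueError("denoising_strength must be set explicitly")
    return payload


body = json.dumps(build_payload("Testing 123"))
print(body)
```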

Feb 28 '23 19:02 theseamusjames

DDIM, PLMS, and UniPC were reworked as part of https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/8285a149d8c488ae6c7a566eb85fb5e825145464. Open a new issue or re-open this one if the problem still persists as of the latest dev branch commit (or once a new release comes out).

Aug 12 '23 07:08 catboxanon