stable-diffusion-webui

{Google Colab Stable Diffusion} [Bug]: OutOfMemoryError: CUDA out of memory

Open cloudywolf66 opened this issue 1 year ago • 4 comments

Checklist

  • [ ] The issue exists after disabling all extensions
  • [X] The issue exists on a clean installation of webui
  • [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • [X] The issue exists in the current version of the webui
  • [X] The issue has not been reported before recently
  • [ ] The issue has been reported before but has not been fixed yet

What happened?

Any attempt to generate a single picture above a certain resolution (currently anything over 1000x1000) results in this error, which claims to need over 5 GB to generate the image. Batch size and batch count are both 1. One example of the error: OutOfMemoryError: CUDA out of memory. Tried to allocate 7.54 GiB. GPU 0 has a total capacity of 14.75 GiB of which 4.33 GiB is free. Process 25132 has 10.41 GiB memory in use. Of the allocated memory 9.93 GiB is allocated by PyTorch, and 353.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting
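For anyone comparing these errors, here is a minimal sketch that pulls the figures out of a message like the one above and reports how far the request overshoots the free memory. The helper name `summarize_oom` and the regular expressions are my own assumptions based on the wording quoted in this thread; other PyTorch versions may phrase the message differently.

```python
import re

# Matches sizes such as "7.54 GiB" or "353.04 MiB".
SIZE = r"([\d.]+) (GiB|MiB)"

# Patterns follow the wording quoted in this thread (assumption: other
# PyTorch versions may word the message differently). "capac\w*" accepts
# both "capacity" and the "capacty" spelling seen in some logs.
PATTERNS = {
    "requested": rf"Tried to allocate {SIZE}",
    "total": rf"total capac\w* of {SIZE}",
    "free": rf"of which {SIZE} is free",
    "allocated": rf"{SIZE} is allocated by PyTorch",
    "reserved_unallocated": rf"{SIZE} is reserved by PyTorch but unallocated",
}


def summarize_oom(message):
    """Return the sizes found in a CUDA OOM message, all converted to GiB."""
    info = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            value, unit = match.groups()
            info[name] = float(value) / 1024 if unit == "MiB" else float(value)
    if "requested" in info and "free" in info:
        # How much more memory the failed allocation needed than was free.
        info["shortfall"] = round(info["requested"] - info["free"], 2)
    return info


message = (
    "OutOfMemoryError: CUDA out of memory. Tried to allocate 7.54 GiB. "
    "GPU 0 has a total capacity of 14.75 GiB of which 4.33 GiB is free. "
    "Process 25132 has 10.41 GiB memory in use. Of the allocated memory "
    "9.93 GiB is allocated by PyTorch, and 353.04 MiB is reserved by "
    "PyTorch but unallocated."
)

for key, value in summarize_oom(message).items():
    print(f"{key}: {value:.2f} GiB")
```

For the message above, the request (7.54 GiB) exceeds the free memory (4.33 GiB) by 3.21 GiB, and only about 0.34 GiB is reserved but unallocated, so fragmentation alone would not explain this particular failure.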

Steps to reproduce the problem

  1. Turn AI on
  2. Attempt to generate an image with resolution greater than 1000x1000
  3. No image generates, instead, receive Error

What should have happened?

  1. Turn on AI
  2. Generate "anything" up to max size allowed on Google Colab
  3. Image Generates

What browsers do you use to access the UI?

Mozilla Firefox

Sysinfo

sysinfo-2024-03-16-17-08.json

Console logs

unsure of how to provide full logs

Additional information

No response

cloudywolf66, Mar 16 '24 17:03

Happens for me as well, with a 512x512 image and Hires. fix enabled with a 2x upscale:

OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacty of 14.75 GiB of which 3.90 GiB is free. Process 22191 has 10.85 GiB memory in use. Of the allocated memory 6.23 GiB is allocated by PyTorch, and 4.45 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Time taken: 4.6 sec.

[Screenshot: CleanShot 2024-03-17 at 12 13 51@2x]
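The error text points at max_split_size_mb and PYTORCH_CUDA_ALLOC_CONF, and here 4.45 GiB is reserved but unallocated, so that hint may be worth trying. A minimal sketch of setting it from a Colab cell follows. The helper name `alloc_conf` and the value 512 are illustrative assumptions, not something confirmed in this thread, and the variable only takes effect if it is set before the process that runs webui makes its first CUDA allocation.

```python
import os


def alloc_conf(max_split_size_mb):
    """Build a PYTORCH_CUDA_ALLOC_CONF value that sets max_split_size_mb."""
    if max_split_size_mb <= 0:
        raise ValueError("max_split_size_mb must be positive")
    return f"max_split_size_mb:{max_split_size_mb}"


# 512 is an illustrative value only (assumption); tune it for your workload.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf(512)
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Run this in a cell before the one that launches webui so the launched process inherits the variable. Whether it helps depends on how much of the shortfall is really due to fragmentation.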

orkenstein, Mar 17 '24 11:03

Full stack trace:

*** Error completing request
*** Arguments: ('task(cfivoaabdsa3lds)', <gradio.routes.Request object at 0x7bfef01d5f90>, 'malformed giant cyclops', '', [], 20, 'DPM++ 2M Karras', 1, 1, 7, 512, 512, True, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], 0, False, '', 0.8, -1, False, -1, 0, 0, 0, False, 'CodeFormer', False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "/content/stable-diffusion-webui/modules/call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "/content/stable-diffusion-webui/modules/call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "/content/stable-diffusion-webui/modules/txt2img.py", line 110, in txt2img
        processed = processing.process_images(p)
      File "/content/stable-diffusion-webui/modules/processing.py", line 785, in process_images
        res = process_images_inner(p)
      File "/content/stable-diffusion-webui/modules/processing.py", line 921, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "/content/stable-diffusion-webui/modules/processing.py", line 1273, in sample
        return self.sample_hr_pass(samples, decoded_samples, seeds, subseeds, subseed_strength, prompts)
      File "/content/stable-diffusion-webui/modules/processing.py", line 1358, in sample_hr_pass
        samples = self.sampler.sample_img2img(self, samples, noise, self.hr_c, self.hr_uc, steps=self.hr_second_pass_steps or self.steps, image_conditioning=image_conditioning)
      File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 188, in sample_img2img
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "/content/stable-diffusion-webui/modules/sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 188, in <lambda>
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/modules/sd_samplers_cfg_denoiser.py", line 237, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 18, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 32, in __call__
        return self.__orig_func(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/modules/sd_unet.py", line 91, in UNetModel_forward
        return original_forward(self, x, timesteps, context, *args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
        h = module(h, emb, context)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
        x = layer(x, context)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
        x = block(x, context=context[i])
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
        return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
        return CheckpointFunction.apply(func, len(inputs), *args)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
        return super().apply(*args, **kwargs)  # type: ignore[misc]
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
        output_tensors = ctx.run_function(*ctx.input_tensors)
      File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
        x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/content/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/content/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 268, in split_cross_attention_forward
        s2 = s1.softmax(dim=-1, dtype=q.dtype)
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacty of 14.75 GiB of which 3.90 GiB is free. Process 22191 has 10.85 GiB memory in use. Of the allocated memory 6.23 GiB is allocated by PyTorch, and 4.45 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
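The trace ends in the softmax over the attention scores inside split_cross_attention_forward, and that tensor grows with the square of the number of latent positions. A rough back-of-the-envelope sketch of its size is below. Everything in it is an assumption for illustration: 8 attention heads, fp16 (2 bytes per element), a batch of 1, latents at 1/8 of the image resolution, and no slicing of the score matrix. The trace does not show how webui actually slices this tensor.

```python
def attention_scores_gib(width, height, heads=8, batch=1,
                         bytes_per_element=2, latent_factor=8):
    """Estimate the size in GiB of an unsliced self-attention score tensor.

    All defaults are illustrative assumptions, not values read from webui.
    """
    # Number of latent positions attended over at the highest-resolution layer.
    tokens = (width // latent_factor) * (height // latent_factor)
    # One tokens x tokens score matrix per head and per batch item.
    elements = batch * heads * tokens * tokens
    return elements * bytes_per_element / 2**30


print(attention_scores_gib(512, 512))    # 0.25
print(attention_scores_gib(1024, 1024))  # 4.0
```

Under these assumptions, doubling each side of the image (512x512 with a 2x Hires. fix pass) multiplies the score tensor by 16, from 0.25 GiB to 4 GiB. That is the same size as the failed allocation in the trace, though the match depends on the assumed head count, precision, and batch size.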

orkenstein, Mar 17 '24 11:03

I got the same as well, except torch was trying to allocate 18 GB for an image that previously generated perfectly fine. Using --opt-sdp-no-mem-attention is causing far more memory usage than before.

I have a fix for you two that you can use in the meantime, since I have no idea what broke it (likely the torch version Google is using or something, and I don't feel like checking version numbers for a million python packages):

switch from --opt-sdp-no-mem-attention to --opt-sdp-attention

that seemed to fix things for me
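If your notebook builds the launch arguments as a list, the swap can be sketched like this. The helper `swap_attention_flag` and the example argument list are hypothetical; how your Colab notebook actually passes arguments to webui may differ.

```python
OLD_FLAG = "--opt-sdp-no-mem-attention"
NEW_FLAG = "--opt-sdp-attention"


def swap_attention_flag(args):
    """Return a copy of args with the no-mem SDP flag replaced by the plain one."""
    return [NEW_FLAG if arg == OLD_FLAG else arg for arg in args]


# Hypothetical launch arguments; substitute the ones your notebook uses.
launch_args = ["--share", "--opt-sdp-no-mem-attention"]
print(" ".join(swap_attention_flag(launch_args)))
```

Arguments other than the old flag pass through unchanged, so a list that never contained it is returned as-is.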

vt-idiot, Mar 25 '24 03:03

When I changed the model, I got the same error as well: changing setting sd_model_checkpoint to sd_xl_base_1.0.safetensors [31e35c80fc]: OutOfMemoryError. Then I got another one: get_extra_networks error: 'NoneType' object has no attribute 'is_sdxl'

equinox-sun, Apr 19 '24 08:04