[bug]: Crash when generating many images with Restore Face enabled
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
Linux
GPU
cuda
VRAM
8GB
What happened?
When using the web interface to generate lots of images ("lots" is subjective here, as it seems to depend on the amount of VRAM) with the Restore Face option enabled from the advanced options menu, InvokeAI will crash.
In my case I was generating 500 images with the Restore Face option enabled, and at image 13 InvokeAI crashed with the message RuntimeError: CUDA out of memory.
If I disable the Restore Face option, I can generate seemingly as many images as I want, although I haven't tested with a batch size greater than 500 yet.
Here is the stack trace:
Traceback (most recent call last):
File "/mnt/media/stable-diffusion/InvokeAI/ldm/generate.py", line 467, in prompt2image
results = generator.generate(
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/base.py", line 90, in generate
image = make_image(x_T)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/txt2img.py", line 57, in make_image
return self.sample_to_image(samples)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/base.py", line 109, in sample_to_image
x_samples = self.model.decode_first_stage(samples)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/models/diffusion/ddpm.py", line 1118, in decode_first_stage
return self.first_stage_model.decode(z)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/models/autoencoder.py", line 421, in decode
dec = self.decoder(z)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/modules/diffusionmodules/model.py", line 613, in forward
h = self.up[i_level].block[i_block](h, temb)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/modules/diffusionmodules/model.py", line 134, in forward
h = self.norm1(x)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 272, in forward
return F.group_norm(
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2516, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 7.79 GiB total capacity; 5.54 GiB already allocated; 295.62 MiB free; 6.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
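The error message itself suggests trying `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF`. I haven't verified that this avoids the crash; below is a minimal sketch of how it could be set before InvokeAI starts (the 128 value is just an example, not a recommendation from the InvokeAI docs):

```python
import os

# The allocator reads this when CUDA is first initialized, so it has to be
# set before torch creates any CUDA tensors (or exported in the shell before
# launching InvokeAI).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable so the allocator sees it
```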
Screenshots
No response
Additional context
For additional context about the batch of images I was generating:
- Model: stable-diffusion-1.5
- Steps: 200
- CFG Scale: 15
- Width: 512
- Height: 768
- Sampler: k_euler
- Seed: <fixed, not random>
- Variation Amount: 0.4
- Restore Face Type: gfpgan
- Restore Face Strength: 0.8
- Upscale Scale: 4x
- Upscale Strength: 0.75
I also tested this with Upscale enabled and Restore Face disabled and did not experience a crash. As far as I can tell the issue only occurs with Restore Face enabled.
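If it helps with debugging, here is a rough sketch (not InvokeAI code, just what I would hook in) for logging allocated VRAM after each image, to confirm that it keeps climbing only when Restore Face is enabled:

```python
import torch

def log_vram(tag: str) -> None:
    # Report currently allocated and peak CUDA memory in MiB.
    alloc = torch.cuda.memory_allocated() / 2**20
    peak = torch.cuda.max_memory_allocated() / 2**20
    print(f"[{tag}] allocated={alloc:.0f} MiB, peak={peak:.0f} MiB")

# Hypothetical usage: call this once after every generated image (e.g. from a
# per-image callback, if one is available) and compare runs with Restore Face
# on and off.
```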
Lastly, I'm not sure which version of InvokeAI I'm using, but it's the one that was available 5-7 days before the latest version with Canvas was released.
Contact Details
No response
So, face restoration is extremely resource-consuming; it is quite normal that for 500 images, with only 8 GB of available VRAM, all the memory gets consumed. I don't know exactly how memory is used, but I believe the whole batch of images has to remain in memory before face restoration starts working.
Yes, but if I had specified 14 images it would still have failed at image 13. The problem doesn't seem to be related to the total number of images being generated; it seems to be that VRAM isn't freed after the Restore Face process completes, which quickly consumes all available VRAM within just a handful of images.
The whole batch remaining in memory doesn't seem to be the case: I can generate 500 images, then go to each one and click the Restore Face button. That produces another image in my library, but it works and doesn't run out of VRAM. When Restore Face runs as part of the batch process, it must work differently, because it doesn't generate a new image in the library and it does run out of VRAM.
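If something from the GFPGAN pass is staying alive in the batch path, a possible mitigation (just a sketch; I don't know the InvokeAI internals) would be to force a cleanup after each restoration:

```python
import gc
import torch

# Hypothetical cleanup after each face-restoration call. This only helps if
# the restoration intermediates are no longer referenced anywhere by the
# time it runs.
gc.collect()               # drop lingering Python references
torch.cuda.empty_cache()   # return cached, unused blocks to the driver
```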
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.