[bug]: Crash when generating many images with Restore Face enabled
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
Linux
GPU
cuda
VRAM
8GB
What happened?
When using the web interface to generate lots of images ("lots" is subjective here, as it seems to depend on the amount of VRAM) with the Restore Face option enabled from the advanced options menu, InvokeAI will crash.
In my case I was generating 500 images with the Restore Face option enabled, and at image 13 InvokeAI crashed with the message RuntimeError: CUDA out of memory.
If I disable the Restore Face option, I can generate seemingly as many images as I want, although I haven't tested with a batch size greater than 500 yet.
Here is the stack trace:
Traceback (most recent call last):
File "/mnt/media/stable-diffusion/InvokeAI/ldm/generate.py", line 467, in prompt2image
results = generator.generate(
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/base.py", line 90, in generate
image = make_image(x_T)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/txt2img.py", line 57, in make_image
return self.sample_to_image(samples)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/invoke/generator/base.py", line 109, in sample_to_image
x_samples = self.model.decode_first_stage(samples)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/models/diffusion/ddpm.py", line 1118, in decode_first_stage
return self.first_stage_model.decode(z)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/models/autoencoder.py", line 421, in decode
dec = self.decoder(z)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/modules/diffusionmodules/model.py", line 613, in forward
h = self.up[i_level].block[i_block](h, temb)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/ldm/modules/diffusionmodules/model.py", line 134, in forward
h = self.norm1(x)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 272, in forward
return F.group_norm(
File "/mnt/media/stable-diffusion/InvokeAI/.venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2516, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 7.79 GiB total capacity; 5.54 GiB already allocated; 295.62 MiB free; 6.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
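The error message itself suggests trying `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF`. I haven't verified that this avoids the crash; below is a minimal sketch of how it could be set before InvokeAI starts (the 128 value is just an example, not a recommendation from the InvokeAI docs):

```python
import os

# The allocator reads this when CUDA is first initialized, so it has to be
# set before torch creates any CUDA tensors (or exported in the shell before
# launching InvokeAI).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable so the allocator sees it
```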
Screenshots
No response
Additional context
For additional context about the batch of images I was generating:
- Model: stable-diffusion-1.5
- Steps: 200
- CFG Scale: 15
- Width: 512
- Height: 768
- Sampler: k_euler
- Seed: <fixed, not random>
- Variation Amount: 0.4
- Restore Face Type: gfpgan
- Restore Face Strength: 0.8
- Upscale Scale: 4x
- Upscale Strength: 0.75
I also tested this with Upscale enabled and Restore Face disabled and did not experience a crash. As far as I can tell the issue only occurs with Restore Face enabled.
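If it helps with debugging, here is a rough sketch (not InvokeAI code, just what I would hook in) for logging allocated VRAM after each image, to confirm that it keeps climbing only when Restore Face is enabled:

```python
import torch

def log_vram(tag: str) -> None:
    # Report currently allocated and peak CUDA memory in MiB.
    alloc = torch.cuda.memory_allocated() / 2**20
    peak = torch.cuda.max_memory_allocated() / 2**20
    print(f"[{tag}] allocated={alloc:.0f} MiB, peak={peak:.0f} MiB")

# Hypothetical usage: call this once after every generated image (e.g. from a
# per-image callback, if one is available) and compare runs with Restore Face
# on and off.
```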
Lastly, I'm not sure which version of InvokeAI I'm using, but it's the one that was available 5-7 days before the latest version with Canvas was released.
Contact Details
No response
So, face restoration is extremely resource-consuming; it is quite normal that for 500 images, with only 8 GB of available VRAM, all the memory gets consumed. I don't know exactly how memory is used, but I believe the whole batch of images has to remain in memory before face restoration starts working.
Yes, but if I had specified 14 images it would still have failed at image 13. The problem doesn't seem to be related to the total number of images being generated; it seems to be that VRAM isn't freed after the Restore Face process completes, which quickly consumes all available VRAM within just a handful of images.
The whole batch remaining in memory doesn't seem to be the case: I can generate 500 images, then go to each one and click the Restore Face button. That produces another image in my library, but it works and doesn't run out of VRAM. When Restore Face runs as part of the batch process, it must work differently, because it doesn't generate a new image in the library and it does run out of VRAM.
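If something from the GFPGAN pass is staying alive in the batch path, a possible mitigation (just a sketch; I don't know the InvokeAI internals) would be to force a cleanup after each restoration:

```python
import gc
import torch

# Hypothetical cleanup after each face-restoration call. This only helps if
# the restoration intermediates are no longer referenced anywhere by the
# time it runs.
gc.collect()               # drop lingering Python references
torch.cuda.empty_cache()   # return cached, unused blocks to the driver
```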
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.