InvokeAI
[bug]: OOM when upscaling with SD1.5 (SDXL works)
Is there an existing issue for this problem?
- [X] I have searched the existing issues
Operating system
Linux
GPU vendor
AMD (ROCm)
GPU model
RX 6700 XT
GPU VRAM
12GB
Version number
5.0.0
Browser
Firefox 130.0.1
Python dependencies
{
  "accelerate": "0.30.1",
  "compel": "2.0.2",
  "cuda": null,
  "diffusers": "0.27.2",
  "numpy": "1.26.4",
  "opencv": "4.9.0.80",
  "onnx": "1.15.0",
  "pillow": "10.4.0",
  "python": "3.10.13",
  "torch": "2.2.2+rocm5.7",
  "torchvision": "0.17.2+rocm5.7",
  "transformers": "4.41.1",
  "xformers": null
}
What happened
When attempting to upscale an image using any SD1.5 model, the application crashes as it runs out of VRAM.
What you expected to happen
Expected the image to be upscaled.
How to reproduce the problem
Select an image, select an SD1.5 model, try upscaling.
Additional context
The testing image's base resolution is 1800 x 1024. The upscaler is the standard RealESRGAN_x2plus, installed through the model manager. Upscaling output was set to 2x (the lowest setting).
Using an SDXL model, the upscaling completes successfully (though it takes a few minutes).
When observing system monitoring tools, there appear to be two main steps that primarily use the GPU. I assume these are the generator model pass and then the upscaler pass, though correct me if I'm wrong. With the SD1.5 attempts, the first step seems to finish (a relatively small amount of VRAM is allocated, processing occurs, then it is deallocated). The second step rapidly ramps up VRAM usage, crashing almost immediately.
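For intuition on why the second step would spike so sharply, here is a hedged back-of-envelope estimate. The channel counts and fp16 assumption are illustrative, not taken from InvokeAI's code: the UNet works on 1/8-resolution latents, while the VAE decoder produces activations at the full pixel resolution of the 2x-upscaled output.

```python
def tensor_mib(w, h, c, dtype_bytes=2):
    """Size of one w*h*c fp16 activation tensor in MiB (illustrative)."""
    return w * h * c * dtype_bytes / 2**20

# First pass: the UNet works in latent space (1/8 resolution, 4 channels).
latent_mib = tensor_mib(1800 // 8, 1024 // 8, 4)

# Second pass: the 2x output is decoded at full pixel resolution, where
# early decoder blocks can carry hundreds of channels (512 assumed here).
decode_mib = tensor_mib(1800 * 2, 1024 * 2, 512)

print(f"latent: {latent_mib:.2f} MiB, decode activation: {decode_mib:.0f} MiB")
```

Under these assumptions a latent tensor is a fraction of a MiB while a single full-resolution decoder activation is on the order of 7 GiB, which matches the observed pattern of a small first step and a rapid second spike.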
This may be related to https://github.com/invoke-ai/InvokeAI/issues/6301
Discord username
cubethethird
I can confirm I am having the exact same issue (SD1.5 upscaling fails by running out of VRAM, while SDXL upscales fine) - same card, same OS platform. I was having the issue on 4.2.7 and upgraded to 5.0.0, but it still persists.
The best workaround for me right now is to generate the image using SD1.5, and then use a similar SDXL model to do the upscaling.
Still occurring in Invoke 5.6.0
Still occurring in Invoke 5.7.2
Still occurring in Invoke 5.9.0
@psychedelicious - I suspect the graph is missing tiled decode
The graphs are identical for the VAE decoding, and the same decode node & node settings are used for both. Tiling and tile size are the same.
Here is my VRAM usage for two upscaling runs:
The overall memory trend is similar, but SDXL needs more VRAM overall. Makes sense - it is a bigger model.
However, SD1.5's decoding spike is greater in absolute value than SDXL's spike.
I don't think this is an issue with graphs or node settings. But I don't understand how the VAE works well enough to troubleshoot - only make observations.
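For reference, tiled decode typically splits the output into overlapping tiles, decodes each tile independently, and blends the overlaps to hide seams. Here is a minimal sketch of the tiling geometry only; the function name and defaults are mine, not InvokeAI's node code:

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Yield (x0, y0, x1, y1) pixel boxes covering the image, with each
    box overlapping its neighbours by `overlap` so seams can be blended."""
    stride = tile - overlap
    for y0 in range(0, max(height - overlap, 1), stride):
        for x0 in range(0, max(width - overlap, 1), stride):
            yield (x0, y0, min(x0 + tile, width), min(y0 + tile, height))

# Each tile is decoded on its own, so peak decoder VRAM scales with the
# tile area rather than the full image area.
boxes = list(tile_boxes(3600, 2048))
```

If the decode node is genuinely tiling in both graphs, per-tile peaks should be similar for SD1.5 and SDXL, which makes the larger SD1.5 spike all the more puzzling.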
Still an issue in Invoke 5.10.1
Still an issue in Invoke 5.11
Still an issue in Invoke 5.12
Still an issue in Invoke 6.0.0
https://discord.com/channels/1020123559063990373/1149510134058471514/1393395097458184242
@hipsterusername and I tested changing the tiled MultiDiffusion tile size from 1024 to 512, and were able to upscale the generation.
My 30 GB AMD card could not handle tiling at 1024, so this is a huge consumer of VRAM.
@hipsterusername said:
Think we may just need to expose this as a frontend setting - it’s strange that it’s taking up so much VRAM, but that may just be a consequence of being optimized for CUDA vs ROCm
I can confirm in Invoke 6.2.0 that, while the default tile options do result in OOM, setting the tile size much lower (in the 500s) does allow the upscale to work. However, I did get a broken output with 512 tiles and 64 overlap, where one of the tiles came out black in the final output.