CUDA runs out of memory with lots of memory reserved

Open eeveeishpowered opened this issue 2 years ago • 11 comments

I'm trying to run the text-to-image model with the example but CUDA keeps running out of memory, despite it barely trying to allocate anything. It's trying to allocate 20MB when there's 7.3GB reserved. Is there any way to fix this? I've searched all over but I couldn't find a clear answer.

eeveeishpowered avatar Aug 07 '22 01:08 eeveeishpowered

Having the same problem:

RuntimeError: CUDA out of memory. Tried to allocate 114.00 MiB (GPU 0; 8.00 GiB total capacity; 7.14 GiB already allocated; 0 bytes free; 7.24 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

litevex avatar Aug 07 '22 12:08 litevex
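[Editor's note] The "reserved in total by PyTorch" figure in the error above can look large while a small allocation still fails, because PyTorch's caching allocator holds reserved memory as discrete blocks and a request needs a single contiguous block. A minimal pure-Python sketch (all block sizes are made-up illustration numbers, not real allocator state):

```python
# Sketch of allocator fragmentation: lots of free memory in total,
# but no single cached block large enough for the request.

def can_allocate(free_blocks_mib, request_mib):
    """Return True if any single free block can satisfy the request."""
    return any(block >= request_mib for block in free_blocks_mib)

# Hypothetical reserved-but-free memory, split into small fragments.
fragments = [64, 96, 48, 112, 80, 100, 90, 110, 100]

print(sum(fragments))                # 800 MiB free in total...
print(can_allocate(fragments, 114))  # False: no contiguous 114 MiB block
```

This is the situation `max_split_size_mb` is meant to mitigate: it stops the allocator from splitting large blocks into fragments above that size.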

The same model (latent diffusion 1.6B) does run on 8 GB when using Jack000/glid-3-xl, so it is supposed to work.

litevex avatar Aug 07 '22 12:08 litevex

I'm having the same problem: it allocates tons of memory and then fails.

num421337 avatar Aug 08 '22 01:08 num421337

Also runs out of VRAM on a 16 GB P100, so something is definitely wrong. This did not happen with the same model on the latent-diffusion repo.

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.90 GiB total capacity; 15.07 GiB already allocated; 21.75 MiB free; 15.25 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

litevex avatar Aug 08 '22 07:08 litevex

It appears I did have the problem on the latent-diffusion repo as well, but I fixed it on both by adding `os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4096"` near the start of `txt2img.py`.

litevex avatar Aug 08 '22 08:08 litevex
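[Editor's note] A minimal sketch of where the workaround goes. `PYTORCH_CUDA_ALLOC_CONF` is read when the CUDA caching allocator initializes, so the variable must be set before the first CUDA allocation; the safest place is before `import torch`. The value `4096` mirrors the comment above, though smaller values are also commonly tried:

```python
import os

# Must run before torch initializes its CUDA caching allocator.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4096"

# import torch  # import torch and load the model only after this point
```

Setting the variable in the shell (`export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4096`) before launching the script has the same effect and avoids editing the file.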

> It appears I did have the problem on the latent-diffusion repo as well, but I fixed it on both by adding `os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4096"` near the start of `txt2img.py`.

Where exactly should I put it? I tried pasting it in and still got the error, granted I have less GPU memory.

num421337 avatar Aug 08 '22 09:08 num421337

> It appears I did have the problem on the latent-diffusion repo as well, but I fixed it on both by adding `os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4096"` near the start of `txt2img.py`.
>
> Where exactly should I put it? I tried pasting it in and still got the error, granted I have less GPU memory.

What GPU are you using? Maybe it doesn't have enough VRAM.

litevex avatar Aug 08 '22 10:08 litevex

Entirely possible, it's a 6 GB GTX 1060. Not horrible, but not new or anything.

num421337 avatar Aug 08 '22 10:08 num421337

> Entirely possible, it's a 6 GB GTX 1060. Not horrible, but not new or anything.

6 GB isn't enough for the large latent diffusion model (which isn't the actual stable diffusion model). It might be enough for the stable diffusion model once it releases, which is half the size.

litevex avatar Aug 08 '22 10:08 litevex

I'm still getting the error, even with `PYTORCH_CUDA_ALLOC_CONF` set.

RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 11.77 GiB total capacity; 8.62 GiB already allocated; 678.19 MiB free; 8.74 GiB reserved in total by PyTorch)

GeForce RTX 3060, 12GB of VRAM

Bleyddyn avatar Aug 23 '22 18:08 Bleyddyn

I solved this by setting `--n_samples` to 1 and using `--n_iter` if I wanted more than one output. Found at: https://www.assemblyai.com/blog/how-to-run-stable-diffusion-locally-to-generate-images/

Bleyddyn avatar Aug 25 '22 20:08 Bleyddyn
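[Editor's note] The trade-off behind that fix: `--n_samples` is the batch size, so all of those images are resident on the GPU at once, while `--n_iter` repeats the sampling loop sequentially. A sketch with a made-up per-image memory cost (the real figure depends on resolution and model):

```python
PER_IMAGE_MIB = 1500  # hypothetical per-image activation cost, for illustration

def peak_vram_mib(n_samples):
    # Peak memory grows with batch size only; n_iter runs batches one at a time.
    return n_samples * PER_IMAGE_MIB

def total_images(n_samples, n_iter):
    return n_samples * n_iter

# One batch of 3 vs. 3 sequential iterations of batch size 1:
print(peak_vram_mib(3), total_images(3, 1))  # 4500 3
print(peak_vram_mib(1), total_images(1, 3))  # 1500 3
```

Both settings produce three images, but the second needs roughly a third of the peak VRAM, at the cost of running three times as long.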