
help ! RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 10.92 GiB total capacity; 8.62 GiB already allocated; 1.39 GiB free; 8.81 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

chinatian opened this issue 2 years ago • 44 comments

error message:

RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 10.92 GiB total capacity; 8.62 GiB already allocated; 1.39 GiB free; 8.81 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I have a GeForce GTX 1080 Ti (11 GB).

chinatian avatar Aug 25 '22 05:08 chinatian

You ran out of GPU memory. Try using nvitop to monitor your GPU memory usage. You can also try this branch: https://github.com/basujindal/stable-diffusion. It trades speed for lower memory usage.

F0rt1s avatar Aug 25 '22 06:08 F0rt1s

You can also just reduce the width and height of the output by using the parameters --H and --W.

Naxter avatar Aug 25 '22 07:08 Naxter
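To see why lowering the resolution helps: the image-sized tensors inside the pipeline scale linearly with the pixel count, so halving both --H and --W cuts them to a quarter of the size. A rough back-of-the-envelope sketch (ignoring model weights and attention buffers, which scale differently):

```python
# Rough illustration of why lowering --H/--W helps: memory for image-sized
# float32 tensors grows linearly with the pixel count, so halving both
# dimensions cuts those tensors to a quarter of the size.
def image_tensor_mib(h: int, w: int, channels: int = 3, bytes_per_el: int = 4) -> float:
    return h * w * channels * bytes_per_el / 2**20

for h, w in [(512, 512), (384, 384), (256, 256)]:
    print(f"{h}x{w}: {image_tensor_mib(h, w):.2f} MiB per float32 image tensor")
```

In practice the self-attention layers scale worse than linearly with the pixel count, so the real savings from a lower resolution are usually larger than this sketch suggests.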

Also, try using --n_samples 1.

chyld avatar Aug 26 '22 15:08 chyld

just use this https://huggingface.co/spaces/stabilityai/stable-diffusion

breadbrowser avatar Aug 27 '22 01:08 breadbrowser

I have a Titan V (12 GB); it only worked with @chyld's tip:

python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms --n_samples 1


With --H 256 --W 256 the results were poor.

titusfx avatar Aug 27 '22 17:08 titusfx

I have the same issue on a GPU with 12GB VRAM. I just switched the model to float16 precision: in scripts/txt2img.py, function load_model_from_config, line 63, change model.cuda() to model.cuda().half()

xmvlad avatar Aug 29 '22 13:08 xmvlad
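The .half() call casts every weight from 32-bit to 16-bit floats, which halves the memory the weights occupy (at some cost in numeric range and precision). A minimal numpy sketch of the effect, with an array standing in for a model weight tensor:

```python
import numpy as np

# A stand-in for a model weight tensor: casting float32 -> float16
# halves the number of bytes it occupies.
weights_fp32 = np.zeros((1024, 1024), dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes // 2**20, "MiB")  # 4 MiB
print(weights_fp16.nbytes // 2**20, "MiB")  # 2 MiB
```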

I have the same issue on a GPU with 12GB VRAM. I just switched the model to float16 precision: in scripts/txt2img.py, function load_model_from_config, line 63, change model.cuda() to model.cuda().half()

@xmvlad would you say the quality had been reduced a lot?

JustinGuese avatar Aug 30 '22 13:08 JustinGuese

And yes, I would basically say you will need at least 12GB of VRAM.

JustinGuese avatar Aug 30 '22 13:08 JustinGuese

It also helps to remove the safety (NSFW) filter, as that model takes ~2GB of VRAM. Just disable those lines, or use my txt2img.py: https://github.com/JustinGuese/stable-diffusor-docker-text2image/blob/master/txt2img.py

JustinGuese avatar Aug 30 '22 13:08 JustinGuese

You could also disable the watermarking, but it does not use as much VRAM.

JustinGuese avatar Aug 30 '22 13:08 JustinGuese

would you say the quality had been reduced a lot?

No, the results were almost the same (I checked some prompts from the web).

xmvlad avatar Aug 30 '22 14:08 xmvlad

I have the same issue on Windows 10:

RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 8.00 GiB total capacity; 5.62 GiB already allocated; 0 bytes free; 5.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

It's not a low-memory issue, it's a no-memory issue (after a PC restart), because PyTorch is possibly taking too much. Any way to reduce what it allocates?

rjamesnw avatar Sep 02 '22 16:09 rjamesnw
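For the question above, the max_split_size_mb option mentioned in the error message itself is one knob: it caps the size of blocks PyTorch's caching allocator will split, which can reduce fragmentation. A sketch, assuming 128 as an arbitrary starting value (not a recommendation from this thread); it must be set before PyTorch initializes CUDA:

```python
import os

# Must be set before the first CUDA allocation
# (easiest: before importing torch at all).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only after the variable is set
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

The same variable can also be exported from the shell before launching the script, which avoids any import-ordering concerns.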

This post seems to help reduce the total reserved size of PyTorch: https://github.com/CompVis/stable-diffusion/issues/86#issuecomment-1230309710

I think Windows is allocating some for itself (I tried closing all apps and over 3GB was still allocated). Using that post's solution helps, along with reducing the height, width, and number of samples: https://github.com/CompVis/stable-diffusion/issues/86#issuecomment-1228617044

rjamesnw avatar Sep 03 '22 12:09 rjamesnw

Faced the same issue. Things that worked for me:

  1. Loading the half model as suggested by @xmvlad here.
  2. Disabling the safety checker and invisible watermarking.
  3. Reducing the number of samples to 1 (--n_samples 1).
  4. Reducing the height and width to 256. This severely affects the quality of the output.

kiranscaria avatar Sep 03 '22 13:09 kiranscaria

Loading the half model successfully fixed the error in txt2img. But when I try the same in img2img, I get a new error:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

Looks like I need to halve something else as well...

phatal avatar Sep 03 '22 17:09 phatal

@phatal

Replace line 54 of scripts/img2img.py with

image = np.array(image).astype(np.float16) / 255.0

And also make sure that your input picture has dimensions of 512x512. The compression rate does not matter.

That worked for me.

lthiet avatar Sep 04 '22 01:09 lthiet
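That error occurs because the model weights are float16 after .half() while the input image tensor is still float32; casting the numpy image to float16, as in the replacement line above, makes the dtypes agree. A numpy sketch of the conversion, with a random array standing in for the loaded image:

```python
import numpy as np

# Stand-in for a loaded 512x512 RGB image (uint8, as PIL would produce).
image = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)

# The fix from the comment above: normalize to [0, 1] in float16 so the
# input dtype matches the half-precision model weights.
image_fp16 = np.array(image).astype(np.float16) / 255.0

print(image_fp16.dtype, image_fp16.shape)
```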

PyTorch is still taking a lot of memory, and it seems a lot of other GPU memory is taken up by something else while the command runs, because the resource monitor shows very little utilized until then. Is 8GB too low a GPU for this system? I can make 384x384 work at most, but would like a higher-res image if possible. I already implemented the ideas above (reduced samples and halved the model), but 512 fails:

> python scripts/txt2img.py --prompt "flying pig" --H 512 --W 512 --seed 27 --n_iter 1 --ddim_steps 100

RuntimeError: CUDA out of memory. Tried to allocate 3.00 GiB (GPU 0; 8.00 GiB total capacity; 3.65 GiB already allocated; 1.18 GiB free; 4.30 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

3.65 GiB already allocated, but it is not allocated before the command runs, especially after a restart and shutting down almost all processes using the GPU (except Windows, obviously).

rjamesnw avatar Sep 05 '22 03:09 rjamesnw

@rjamesnw After using the half-precision model, I saw GPU consumption peak at ~12-13GB. To lower GPU consumption further, you can refer to issue #95. You can also look at repos targeting a smaller VRAM footprint, like https://github.com/SkylarKelty/stable-diffusion

kiranscaria avatar Sep 05 '22 03:09 kiranscaria

This worked for me https://constant.meiring.nz/playing/2022/08/04/playing-with-stable-diffusion.html

hkiang01 avatar Sep 08 '22 06:09 hkiang01

I managed to get it to work on an RTX 2060 with only 6GB VRAM by using a lower resolution. Make sure to add model.to(torch.float16) in the load_model_from_config function, just before model.cuda() is called. If that's still not enough, lower the resolution. For me 384x384 works well, but I also experimented with 256x768 and 320x704, which still produce good quality if you give it the right prompt.


dyanechi avatar Sep 10 '22 20:09 dyanechi

I'm getting this error for img2img on an RTX 3090 on Ubuntu.

RuntimeError: CUDA out of memory. Tried to allocate 26.11 GiB (GPU 0; 23.70 GiB total capacity; 4.31 GiB already allocated; 16.35 GiB free; 5.03 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Using --n_samples 1 did not help.

What worked for me was resizing the source image and converting it to png, as follows:

convert source.jpg -resize 512x512 source.png

Hope this helps someone.

drfinkus avatar Sep 12 '22 17:09 drfinkus
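The convert command above is ImageMagick; if it isn't installed, roughly the same resize can be sketched in Python with Pillow (the filenames and the blank stand-in image are placeholders):

```python
from PIL import Image

# Stand-in for Image.open("source.jpg"); a blank image keeps the sketch
# self-contained.
src = Image.new("RGB", (1920, 1080))

# Resize to the 512x512 the script expects and save as PNG.
resized = src.resize((512, 512))
resized.save("source.png")
print(resized.size)
```

Note that PIL's resize forces exactly 512x512 (distorting the aspect ratio), whereas ImageMagick's -resize 512x512 fits the image within that box while preserving the aspect ratio.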

It also helps to remove the safety (NSFW) filter, as that model takes ~2GB of VRAM. Just disable those lines, or use my txt2img.py: https://github.com/JustinGuese/stable-diffusor-docker-text2image/blob/master/txt2img.py

I won't ask why you conveniently have a txt2img with the NSFW filter removed.....

JAsaxon avatar Sep 18 '22 00:09 JAsaxon

@dyanechi That worked for me. I needed to add --n_samples 1 (following https://github.com/CompVis/stable-diffusion/issues/86#issuecomment-1236122289), but now I don't need to scale down. Thanks.

rjamesnw avatar Sep 18 '22 04:09 rjamesnw

RuntimeError: CUDA out of memory. Tried to allocate 2.15 GiB (GPU 0; 8.00 GiB total capacity; 6.26 GiB already allocated; 0 bytes free; 6.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF :( impossible command!!!!

mary5050 avatar Dec 11 '22 20:12 mary5050

Same thing here: there is tons of available memory, but it keeps reporting that reserved memory is larger than allocated memory...

Bec-k avatar Jan 06 '23 14:01 Bec-k

Can you explain for dummies where and how I do this? Thanks.

Florencia007 avatar Feb 18 '23 12:02 Florencia007

it helps removing the sfw filter as the model takes ~2GB VRAM just disable the lines or use my txt2img.py https://github.com/JustinGuese/stable-diffusor-docker-text2image/blob/master/txt2img.py

I don't know where to put this file and what I need to change afterward to load it. Can you explain it for dummies like me? LOL

Florencia007 avatar Feb 18 '23 12:02 Florencia007

@phatal

Replace line 54 of scripts/img2img.py with

image = np.array(image).astype(np.float16) / 255.0

And also make sure that your input picture has a dimension of 512x512. Compression rate does not matter.

That worked for me.

I don't know where to replace this line or in which file; I am new, help please.

MiguelPunkUchi avatar Feb 19 '23 16:02 MiguelPunkUchi

No, it is very small. I don't like that 512 size.


mary5050 avatar Feb 19 '23 19:02 mary5050

I am trying this for NLP and am getting a similar error on my NVIDIA MX350:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 2.00 GiB total capacity; 1.63 GiB already allocated; 0 bytes free; 1.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

nishchalkarwade avatar Mar 03 '23 19:03 nishchalkarwade