
CUDA out of memory though I still got a lot of free VRAM

Open PBoy20511 opened this issue 1 year ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

I'm running Stable Diffusion on my 4090, but this keeps showing up:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 23.99 GiB total capacity; 4.82 GiB already allocated; 16.48 GiB free; 4.92 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Even though I still have 16.48 GiB free, it keeps showing CUDA out of memory. Does anyone know what the problem is?
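For debugging, a minimal sketch (assuming it runs in the same Python process that hits the error) to compare the driver's free-VRAM figure with PyTorch's own allocator counters from the error message:

```python
# Minimal sketch: compare the driver's view of GPU 0 with PyTorch's allocator.
import torch

free_b, total_b = torch.cuda.mem_get_info(0)  # driver-level free/total bytes
print(f"driver free : {free_b / 2**30:.2f} GiB / {total_b / 2**30:.2f} GiB")
print(f"allocated   : {torch.cuda.memory_allocated(0) / 2**30:.2f} GiB")
print(f"reserved    : {torch.cuda.memory_reserved(0) / 2**30:.2f} GiB")
```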

Steps to reproduce the problem

  1. Go to ....
  2. Press ....
  3. ...

What should have happened?

It shouldn't run out of CUDA memory.

Commit where the problem happens

Not sure

What platforms do you use to access the UI ?

No response

What browsers do you use to access the UI ?

No response

Command Line Arguments

I use webui-user.bat

List of extensions

Lora

Console logs

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 23.99 GiB total capacity; 4.82 GiB already allocated; 16.48 GiB free; 4.92 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Additional information

No response

PBoy20511 avatar Apr 06 '23 05:04 PBoy20511

What did you try to do? Textual Inversion training? Normal image generation?

Trung0246 avatar Apr 06 '23 06:04 Trung0246

Try setting this as a system variable under Environment Variables: Variable name = PYTHON_CUDA_ALLOC_CONF, Variable value = max_split_size_mb:1024

crappypatty avatar Apr 10 '23 13:04 crappypatty

I have the same problem.

50mkw avatar Apr 13 '23 13:04 50mkw

Try setting this as a system variable under Environment Variables: Variable name = PYTHON_CUDA_ALLOC_CONF, Variable value = max_split_size_mb:1024

It's not working in this case; it seems something in the system limits how much memory PyTorch can reserve.

50mkw avatar Apr 13 '23 13:04 50mkw

Try setting this as a system variable under Environment Variables: Variable name = PYTHON_CUDA_ALLOC_CONF, Variable value = max_split_size_mb:1024

It's not working in this case; it seems something in the system limits how much memory PyTorch can reserve.

I miswrote the variable name lol. It is actually PYTORCH_CUDA_ALLOC_CONF. I didn't notice it.

crappypatty avatar Apr 15 '23 13:04 crappypatty
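For reference, a minimal sketch of applying the corrected variable from Python instead of the Windows Environment Variables dialog; it has to be set before torch initializes CUDA:

```python
# Sketch: equivalent of the system variable, set from Python.
# Must be set before torch initializes CUDA, so do it before importing torch.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:1024"

import torch  # the caching allocator reads the variable when CUDA initializes
```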

Check the output of nvidia-smi and see if there are any zombie processes; kill them if so. This usually happens when a client is still requesting something after the server was shut down.

tankwyn avatar Apr 25 '23 08:04 tankwyn
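A rough way to script that check (a sketch assuming nvidia-smi is on PATH; the flags are standard nvidia-smi query options):

```python
# Sketch: list GPU compute processes so leftover ("zombie") ones can be killed.
import subprocess

out = subprocess.check_output(
    ["nvidia-smi",
     "--query-compute-apps=pid,process_name,used_gpu_memory",
     "--format=csv,noheader"],
    text=True,
)
print(out)  # e.g. "12345, python.exe, 4821 MiB" -> kill any stale PID
```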

I got a similar error. My GPU is an NVIDIA GeForce RTX 3050 with 8 GB (7.6 GB available: 4 GB dedicated GPU memory and 3.6 GB shared GPU memory). When I run the pretrained Stable Diffusion pipeline, all of the memory is utilized and no out-of-memory error occurs. But when I run it step by step, I get an out-of-memory error at the step unet.to(torch_device).

I referenced the notebook from https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb

Below are the versions of the modules I used:

!pip install diffusers==0.10.0
!pip install huggingface-hub>=0.11.1
!pip install transformers==4.25.1
!pip install ftfy==6.1.1
!pip install accelerate==0.15.0

My CUDA version is 12.1; detailed info was in an attached image.

minkhantDoctoral avatar Jul 27 '23 10:07 minkhantDoctoral
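One possible workaround for the step-by-step notebook on a card with 4 GB of dedicated VRAM is loading the UNet in half precision, so the unet.to(torch_device) step needs roughly half the memory (a sketch assuming the notebook's CompVis/stable-diffusion-v1-4 checkpoint and the diffusers version pinned above):

```python
# Sketch: load the UNet in fp16 so the .to(torch_device) step fits in ~4 GB.
import torch
from diffusers import UNet2DConditionModel

torch_device = "cuda"
unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    subfolder="unet",
    torch_dtype=torch.float16,  # roughly halves the fp32 UNet's VRAM footprint
)
unet = unet.to(torch_device)  # the step that previously ran out of memory
```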