stable-diffusion-webui

GPU memory is not freed

Open goshante opened this issue 2 years ago • 10 comments

Every fuckin run my NVIDIA memory is not freed; it stays 100% full until I fully restart the webui. Are you joking, did you really forget to release memory after the result? What is the problem with releasing video memory at the end? I cannot start the next picture generation just because of RuntimeError: CUDA out of memory.

OF COURSE IT'S OUT OF MEMORY, BECAUSE YOU ARE STILL KEEPING IT

goshante avatar Sep 10 '22 16:09 goshante
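
Some background on the report above: PyTorch's caching allocator keeps freed blocks reserved for reuse, so tools such as nvidia-smi can show the memory as occupied between runs even after the tensors themselves are gone. The following is a minimal sketch of explicitly handing that cache back, assuming PyTorch is installed. The helper name release_cached_vram is hypothetical and this is not the webui's own code.

```python
import gc

try:
    import torch  # assumption: PyTorch is installed; the sketch degrades to a no-op otherwise
except ImportError:
    torch = None


def release_cached_vram():
    """Hypothetical helper: collect garbage, then release PyTorch's unused cached CUDA blocks.

    Returns (allocated_bytes, reserved_bytes) afterwards, or None without CUDA.
    """
    # Tensors that are only kept alive by reference cycles are freed here first.
    gc.collect()
    if torch is None or not torch.cuda.is_available():
        return None
    # Returns cached blocks that no live tensor uses; live tensors are untouched.
    torch.cuda.empty_cache()
    return torch.cuda.memory_allocated(), torch.cuda.memory_reserved()


print(release_cached_vram())
```

Memory that is still referenced by a live tensor is not released by this call, so it only helps when the references have already been dropped.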

PRs welcome

AUTOMATIC1111 avatar Sep 10 '22 17:09 AUTOMATIC1111

If you can explain your problem without being rude we'll certainly try to help you. Nobody owes you anything.

orionaskatu avatar Sep 10 '22 17:09 orionaskatu

Seems like a driver issue to me; memory gets cleared just fine on my machine after every run, without restarting.

ChinatsuHS avatar Sep 10 '22 18:09 ChinatsuHS

I also got a message that I was out of GPU memory. The message also indicated that much more memory was allocated to PyTorch, leaving little available for batch jobs, which limited how many samples I could produce in a batch. However, I followed the instructions on this project's website and modified my webui-user.bat file to include the line: set COMMANDLINE_ARGS=--opt-split-attention

I don't know if it was the right thing to do, but I haven't gotten the out of memory message since! This (awesome) program runs perfectly fine now for me with no need to restart. I have an RTX 3060.

whelanh avatar Sep 10 '22 18:09 whelanh
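
For readers wondering why an option like --opt-split-attention mentioned above could lower peak memory: the general idea of splitting attention is to compute the score matrix for a few query rows at a time instead of all at once. Below is a toy pure-Python sketch of that idea, not the webui's implementation. Which dimension the webui actually slices is not shown in this thread, and the function names are made up for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_block(q_rows, k, v):
    """softmax(q k^T / sqrt(d)) v, materialising a len(q_rows) x len(k) score matrix."""
    scale = 1.0 / math.sqrt(len(k[0]))
    scores = [[scale * sum(a * b for a, b in zip(q, key)) for key in k] for q in q_rows]
    weights = [softmax(row) for row in scores]
    width = len(v[0])
    return [[sum(w * val[j] for w, val in zip(row, v)) for j in range(width)]
            for row in weights]


def split_attention(q, k, v, chunk_size):
    """Same result, but only chunk_size x len(k) scores exist at any one time."""
    out = []
    for start in range(0, len(q), chunk_size):
        out.extend(attention_block(q[start:start + chunk_size], k, v))
    return out


q = [[0.1, 0.2], [0.3, -0.4], [1.0, 0.5]]
k = [[0.5, 0.1], [-0.2, 0.7]]
v = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(split_attention(q, k, v, chunk_size=2))
```

Each query row is independent of the others under softmax, which is why chunking changes the peak size of the intermediate matrix but not the output.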

There is another issue where users are claiming there is a leak, but I can't reproduce it and they don't even want to elaborate.

AUTOMATIC1111 avatar Sep 10 '22 21:09 AUTOMATIC1111

Sometimes I get a 2.0 GB VRAM spike near the end of a generation that can cause an OOM error. It doesn't always happen, and I've had it happen both on this UI and hlky/sd-webui at different times (just had it happen recently when testing the new attention code).

TheEnhas avatar Sep 10 '22 22:09 TheEnhas
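
A spike like the one TheEnhas describes above could be confirmed by recording PyTorch's peak allocation around a generation. This is a minimal sketch assuming PyTorch is installed. The context manager peak_vram and the commented-out generate_image call are hypothetical names, not part of the webui.

```python
import contextlib

try:
    import torch  # assumption: PyTorch is installed; without CUDA the block just runs the body
except ImportError:
    torch = None


@contextlib.contextmanager
def peak_vram(label, report=print):
    """Hypothetical helper: report the peak CUDA memory allocated inside the with-block."""
    if torch is None or not torch.cuda.is_available():
        yield
        return
    torch.cuda.reset_peak_memory_stats()
    try:
        yield
    finally:
        # Wait for queued GPU work so the peak covers everything launched in the block.
        torch.cuda.synchronize()
        peak_mib = torch.cuda.max_memory_allocated() / 2**20
        report(f"{label}: peak {peak_mib:.0f} MiB allocated")


with peak_vram("one generation"):
    pass  # e.g. generate_image(...), a placeholder for the real work
```

Comparing the reported peak with the steady-state figure would show whether the end-of-run spike is real and how large it is on a given card.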

This has actually happened to me before but seemed random.

bbecausereasonss avatar Sep 10 '22 23:09 bbecausereasonss

EDIT: I had another program open that was somewhat GPU dependent (Godot engine), and that seems to have caused the crash. After closing all GPU-intensive programs I can generate multiple batches in sequence without having to restart the webui. Maybe that is the reason for the crashes for others too.


For me it seems to happen regularly after the second time I try to generate a batch in one session. I am running an NVIDIA GeForce GTX 1660 SUPER on Pop!_OS 22.04 LTS with the following command line args: Launching Web UI with arguments: --precision full --no-half --medvram --opt-split-attention

Here is the error message:

txt2img: stone tiled road pencil drawing minimal sketch
Batch 1 out of 4: 100%|████████████| 20/20 [00:12<00:00,  1.64it/s]
Batch 2 out of 4:   0%|                     | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('stone tiled road pencil drawing minimal sketch', '', 'None', 'None', 20, 0, False, True, 4, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, False, 0.7, 0, False, False, None, '', 1, '', 4, '', True, False) {}
Traceback (most recent call last):
  File "/home/antoniodell/stable-diffusion-webui/modules/ui.py", line 153, in f
    res = list(func(*args, **kwargs))
  File "/home/antoniodell/stable-diffusion-webui/webui.py", line 63, in f
    res = func(*args, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/modules/txt2img.py", line 41, in txt2img
    processed = process_images(p)
  File "/home/antoniodell/stable-diffusion-webui/modules/processing.py", line 361, in process_images
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength)
  File "/home/antoniodell/stable-diffusion-webui/modules/processing.py", line 469, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning)
  File "/home/antoniodell/stable-diffusion-webui/modules/sd_samplers.py", line 320, in sample
    samples = self.func(self.model_wrap_cfg, x, extra_args={'cond': conditioning, 'uncond': unconditional_conditioning, 'cond_scale': p.cfg_scale}, disable=False, callback=self.callback_state, **extra_params_kwargs)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 78, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/modules/sd_samplers.py", line 194, in forward
    uncond = self.inner_model(x, sigma, cond=uncond)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 100, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 126, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 987, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 1410, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 737, in forward
    h = module(h, emb, context)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 85, in forward
    x = layer(x, context)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 258, in forward
    x = block(x, context=context)
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 209, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 127, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/antoniodell/stable-diffusion-webui/modules/sd_hijack.py", line 91, in split_cross_attention_forward
    s2 = s1.softmax(dim=-1, dtype=q.dtype)
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 5.80 GiB total capacity; 3.52 GiB already allocated; 252.81 MiB free; 3.86 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@AUTOMATIC1111 I am sadly not very proficient in GPU programming or algorithms, but if you tell me how I can help you look into this issue, I'd be glad to be of service.

Awesome project btw!

@whelanh Sadly using --opt-split-attention does not seem to solve the issue for me.

AntonioDell avatar Sep 30 '22 06:09 AntonioDell
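
The figures in the CUDA out of memory message from AntonioDell's traceback above can be broken down with a little arithmetic. The sketch below parses that exact message and compares the numbers. Reading "free" as the device-wide free memory is an assumption about how PyTorch builds this message, and parse_oom is a hypothetical helper name.

```python
import re

# Message text copied from the traceback in this thread.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 5.80 GiB total capacity; "
    "3.52 GiB already allocated; 252.81 MiB free; 3.86 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
    "total": r"([\d.]+) (MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (MiB|GiB) already allocated",
    "free": r"([\d.]+) (MiB|GiB) free",
    "reserved": r"([\d.]+) (MiB|GiB) reserved",
}


def parse_oom(message):
    """Return every size named in the message, converted to MiB."""
    sizes = {}
    for name, pattern in PATTERNS.items():
        value, unit = re.search(pattern, message).groups()
        sizes[name] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes = parse_oom(MESSAGE)
# Cached by PyTorch but not backing any live tensor.
cached_unused = sizes["reserved"] - sizes["allocated"]
# Neither reserved by PyTorch nor free: other processes, the desktop, CUDA context overhead.
outside_pytorch = sizes["total"] - sizes["reserved"] - sizes["free"]

print(f"requested {sizes['requested']:.2f} MiB vs {sizes['free']:.2f} MiB free")
print(f"reserved but unallocated: {cached_unused:.2f} MiB")
print(f"in use outside PyTorch's reservation: {outside_pytorch:.2f} MiB")
```

The request (256 MiB) only narrowly exceeds what is free, while a sizeable share of the card is held outside PyTorch's reservation, which would fit the EDIT above about another GPU program being open.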

Hi guys, I'm not sure if I have the exact same issue, but whenever I choose a different model from the UI and start generating, the batch size (and/or img_size) I can use drops. It recovers when I relaunch the app. I'm new and have not yet read the entire wiki and readme, so I might just be missing a setting. I'm using Windows 11 with an NVIDIA RTX 3060 GPU, and I'm on commit 685f963.

berkaykarlik avatar Dec 18 '22 15:12 berkaykarlik

I'm having the same problem on Linux, though on the same machine under Windows the problem seems nonexistent. When I run nvidia-smi on Linux, it shows that python3 is still holding the memory from the previous generation. When I restart the app, everything works again. Awesome software btw! :heart:

giriss avatar Mar 23 '23 13:03 giriss
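
The nvidia-smi check giriss describes above can also be scripted, which makes it easier to attach concrete numbers to a report. This is a minimal sketch that assumes nvidia-smi is on the PATH. The sample output is invented for illustration and the function names are hypothetical.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Turn nvidia-smi CSV rows into dicts; memory is MiB, or None when not reported."""
    apps = []
    for line in csv_text.strip().splitlines():
        # Split from both ends so a comma inside the process name does not break parsing.
        pid, rest = line.split(",", 1)
        name, used = rest.rsplit(",", 1)
        used = used.strip()
        apps.append({
            "pid": int(pid),
            "name": name.strip(),
            "used_mib": int(used) if used.isdigit() else None,
        })
    return apps


def gpu_processes():
    """Run nvidia-smi and return the processes currently holding GPU memory."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)


# Invented sample output, only to show the parsed shape without needing a GPU.
SAMPLE = "4242, python3, 3952\n5150, godot, 310\n"
for app in parse_compute_apps(SAMPLE):
    print(app)
```

Calling gpu_processes() before and after a generation would show whether the python3 process really keeps its memory once the job has finished.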

I was having this issue on Ubuntu today when generating lots of large-ish frames in a row via the API. I'm not using batches or anything, and I'm on a GTX 1080 with 8 GB of VRAM.

Installing xformers and updating PyTorch seems to have freed things up. I'm not totally sure, but I'm 10 frames deep and it seems to be working.

SlimeQ avatar May 23 '23 06:05 SlimeQ

I propose closing this issue. It is old, the version is long outdated, and if people add to it without actual error information, that is not helpful IMHO.

TheOnlyHolyMoly avatar Jun 02 '23 20:06 TheOnlyHolyMoly

reopen if this is still an issue

w-e-w avatar Jun 18 '23 08:06 w-e-w