stable-diffusion-webui
GPU memory is not freed
Every run, my NVIDIA memory is not freed; it stays at 100% until I fully restart the webui. Are you joking, did you really forget to release memory after producing the result? What is the problem with releasing video memory at the end? I cannot start the next image generation because of RuntimeError: CUDA out of memory.
OF COURSE IT'S OUT OF MEMORY, BECAUSE YOU'RE STILL HOLDING ON TO IT
PRs welcome
If you can explain your problem without being rude, we'll certainly try to help you. Nobody owes you anything.
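For anyone who wants to take a stab at a PR: the usual PyTorch pattern for handing cached VRAM back between runs is to drop references to the large tensors, run the garbage collector, and then empty the CUDA caching allocator. This is only a minimal sketch of that general pattern (free_cuda_memory is a made-up helper name), not the cleanup path the webui actually uses:

import gc
import torch

def free_cuda_memory():
    # Minimal sketch of the standard PyTorch cleanup pattern, not the
    # webui's actual code. Callers must first drop their own references
    # to large tensors (latents, decoded images, ...) or nothing here
    # can be collected.
    gc.collect()
    if torch.cuda.is_available():
        # Hand cached allocator blocks back to the driver; the usage
        # reported by nvidia-smi should drop afterwards.
        torch.cuda.empty_cache()
        # Also release memory held for inter-process sharing, if any.
        torch.cuda.ipc_collect()

Memory still referenced by the model weights themselves stays allocated by design; the question in this issue is about intermediate activations that should become collectable once a generation finishes.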
Seems like a driver issue to me; memory gets cleared just fine on my machine every run without restarting.
I also had a message saying I was out of GPU memory. The message also indicated that much more memory was being reserved by PyTorch, leaving little available for batch jobs, which limited how many samples I could produce in a batch. However, I followed the instructions on this project's website and modified my webui-user.bat file to include the line: set COMMANDLINE_ARGS=--opt-split-attention
I don't know if it was the right thing to do, but I haven't gotten the out-of-memory message since! This (awesome) program now runs perfectly fine for me with no need to restart. I have an RTX 3060.
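For reference, the relevant part of my webui-user.bat ended up looking roughly like this; the other lines are just the stock template, and I only touched COMMANDLINE_ARGS (whether --opt-split-attention is the right fix for anyone else is a separate question):

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--opt-split-attention

call webui.bat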
There is another issue where users claim there is a leak, but I can't reproduce it and they won't even elaborate.
Sometimes I get a 2.0 GB VRAM spike near the end of a generation that can cause an OOM error. It doesn't always happen, and I've had it happen both on this UI and hlky/sd-webui at different times (just had it happen recently when testing the new attention code).
This has actually happened to me before but seemed random.
EDIT: I had another program open that was somewhat GPU-dependent (Godot Engine), and that seems to have caused the crash. After closing all GPU-intensive programs I can generate multiple batches in sequence without having to restart the webui. Maybe that is the reason for the crashes others are seeing, too.
For me it seems to happen regularly after the second time I try to generate a batch in one session.
I am running an NVIDIA GeForce GTX 1660 SUPER on Pop!_OS 22.04 LTS with the following command-line args:
Launching Web UI with arguments: --precision full --no-half --medvram --opt-split-attention
Here is the error message:
txt2img: stone tiled road pencil drawing minimal sketch
Batch 1 out of 4: 100%|████████████| 20/20 [00:12<00:00, 1.64it/s]
Batch 2 out of 4: 0%| | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('stone tiled road pencil drawing minimal sketch', '', 'None', 'None', 20, 0, False, True, 4, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, False, 0.7, 0, False, False, None, '', 1, '', 4, '', True, False) {}
Traceback (most recent call last):
File "/home/antoniodell/stable-diffusion-webui/modules/ui.py", line 153, in f
res = list(func(*args, **kwargs))
File "/home/antoniodell/stable-diffusion-webui/webui.py", line 63, in f
res = func(*args, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/modules/txt2img.py", line 41, in txt2img
processed = process_images(p)
File "/home/antoniodell/stable-diffusion-webui/modules/processing.py", line 361, in process_images
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength)
File "/home/antoniodell/stable-diffusion-webui/modules/processing.py", line 469, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning)
File "/home/antoniodell/stable-diffusion-webui/modules/sd_samplers.py", line 320, in sample
samples = self.func(self.model_wrap_cfg, x, extra_args={'cond': conditioning, 'uncond': unconditional_conditioning, 'cond_scale': p.cfg_scale}, disable=False, callback=self.callback_state, **extra_params_kwargs)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 78, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/modules/sd_samplers.py", line 194, in forward
uncond = self.inner_model(x, sigma, cond=uncond)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 100, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 126, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 987, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
result = forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 1410, in forward
out = self.diffusion_model(x, t, context=cc)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 737, in forward
h = module(h, emb, context)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 85, in forward
x = layer(x, context)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 258, in forward
x = block(x, context=context)
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 209, in forward
return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 127, in forward
output_tensors = ctx.run_function(*ctx.input_tensors)
File "/home/antoniodell/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
x = self.attn1(self.norm1(x)) + x
File "/home/antoniodell/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/antoniodell/stable-diffusion-webui/modules/sd_hijack.py", line 91, in split_cross_attention_forward
s2 = s1.softmax(dim=-1, dtype=q.dtype)
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 5.80 GiB total capacity; 3.52 GiB already allocated; 252.81 MiB free; 3.86 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
@AUTOMATIC1111 I am sadly not very proficient in GPU programming or algorithms, but if you tell me how I can help you look into this issue, I'd be glad to be of service.
Awesome project btw!
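One thing I will probably try next, since the error message itself suggests it, is setting PYTORCH_CUDA_ALLOC_CONF before launching so the allocator splits its blocks more aggressively. Something along these lines, where 128 is just an arbitrary example value and not a recommendation from the docs (my COMMANDLINE_ARGS stay as they are):

# in the shell (or exported from webui-user.sh) before starting the UI
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
./webui.sh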
@whelanh Sadly, using --opt-split-attention does not seem to solve the issue for me.
Hi guys, I'm not sure if I have the exact same issue, but whenever I choose a different model from the UI and start generating, the maximum batch size (and/or image size) I can use drops. It recovers when I relaunch the app. I'm new and haven't yet read the entire wiki and readme, so I might just be missing a setting. I'm using Windows 11 with an NVIDIA RTX 3060, on commit 685f963.
I'm having the same problem on Linux, though on the same machine under Windows the problem seems nonexistent. When I run nvidia-smi on Linux, it shows that python3 is still holding the memory from the previous generation. So when I restart the app, everything works again.
Awesome software btw! :heart:
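If it helps with debugging, this is roughly what I'd run inside the webui's Python process (from a debugger or a small patch; the UI doesn't expose it) to tell PyTorch's reusable cache apart from a real leak. A rough diagnostic sketch, nothing more:

import torch

def report_cuda_memory(device=0):
    # Rough diagnostic sketch: nvidia-smi charges both numbers (plus the
    # CUDA context overhead) to the python3 process, so a large
    # "reserved" with a small "allocated" usually means cache, not leak.
    allocated = torch.cuda.memory_allocated(device) / 2**20
    reserved = torch.cuda.memory_reserved(device) / 2**20
    print(f"allocated: {allocated:.0f} MiB, reserved: {reserved:.0f} MiB")

    # Cached-but-unused blocks can be handed back to the driver:
    torch.cuda.empty_cache()
    print(f"reserved after empty_cache: "
          f"{torch.cuda.memory_reserved(device) / 2**20:.0f} MiB")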
I was having this issue on Ubuntu today when generating lots of large(ish) frames in a row via the API. I'm not using batches or anything, just a GTX 1080 with 8 GB of VRAM.
Installing xformers and updating PyTorch seems to have freed things up. Not totally sure, but I'm 10 frames deep and it seems to be working.
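In case it helps anyone else on Linux, enabling it was roughly just a matter of adding the flag to webui-user.sh and relaunching. Treat this as a sketch of my setup rather than official install instructions, since the exact xformers install steps depend on your CUDA and PyTorch versions:

# webui-user.sh
export COMMANDLINE_ARGS="--xformers"

# then relaunch
./webui.sh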
I would propose closing this issue. It is old, the version is long outdated, and people adding to it without actual error information is not helpful, IMHO.
Reopen if this is still an issue.