
Textual Inversion CUDA errors (mentions hypernetwork)

Rartist opened this issue 2 years ago • 12 comments

Describe the bug
I was able to test out and use Textual Inversion two or three days ago. Suddenly I am running into CUDA errors, even when trying to train on different models.

Traceback (most recent call last):
  File "F:\StableDiffusion\stable-diffusion-webui\modules\ui.py", line 158, in f
    res = list(func(*args, **kwargs))
  File "F:\StableDiffusion\stable-diffusion-webui\webui.py", line 65, in f
    res = func(*args, **kwargs)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\textual_inversion\ui.py", line 29, in train_embedding
    embedding, filename = modules.textual_inversion.textual_inversion.train_embedding(*args)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 223, in train_embedding
    loss.backward()
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\function.py", line 253, in apply
    return user_fn(self, *args)
  File "F:\StableDiffusion\stable-diffusion-webui\repositories\stable-diffusion\ldm\modules\diffusionmodules\util.py", line 138, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "F:\StableDiffusion\stable-diffusion-webui\repositories\stable-diffusion\ldm\modules\attention.py", line 243, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\hypernetwork.py", line 75, in attention_CrossAttention_forward
    sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 6.00 GiB total capacity; 4.94 GiB already allocated; 0 bytes free; 5.10 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
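
For what it's worth, the max_split_size_mb knob the error message mentions is read from the PYTORCH_CUDA_ALLOC_CONF environment variable when CUDA is first initialized, so it has to be set before torch touches the GPU. A minimal sketch (the 128 MB value here is only an illustration, not a tuned recommendation):

```python
# Must run before the first CUDA allocation: the caching allocator reads
# PYTORCH_CUDA_ALLOC_CONF once, when CUDA is initialized.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Any CUDA work from here on uses the configured split size.
x = torch.randn(1024, 1024, device="cuda")
print(torch.cuda.memory_reserved() / 2**20, "MiB reserved")
```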

Desktop (please complete the following information):

  • OS: Windows 10
  • Commit revision 87db6f01cc6b118fe0c82c36c6686d72d060c417

Additional context
The last time I used it was before the implementation of Hypernetworks.

Rartist avatar Oct 08 '22 07:10 Rartist

Yep, I'm getting CUDA out-of-memory errors now while trying to use Textual Inversion; it previously ran without issues. FYI, I was using Textual Inversion with the web UI on 6 GB of VRAM.

mediocreatmybest avatar Oct 08 '22 09:10 mediocreatmybest

As a quick follow-up: I rolled back to commit 2995107fa24cfd72b0a991e18271dcde148c2807 and it looks to be working fine.

mediocreatmybest avatar Oct 08 '22 10:10 mediocreatmybest

> As a quick follow-up: I rolled back to commit 2995107 and it looks to be working fine.

I tried that but can't confirm; it fails both with the base 1.4 model and with a merged CKPT.

Rartist avatar Oct 09 '22 08:10 Rartist

Thought I'd check again, as quite a bit of additional work has been done since the Hypernetwork inclusion. Yep, still hitting it. My configuration: Windows 10, standard Stable Diffusion 1.4 model, no Hypernetwork files/folders.

RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 6.00 GiB total capacity; 4.58 GiB already allocated; 0 bytes free; 4.84 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

In the console, it errors out just after applying the cross attention optimization.

Preparing dataset...
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:07<00:00, 6.50it/s]
0%| | 0/1000 [00:00<?, ?it/s]
Applying cross attention optimization.
Error completing request
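
The "reserved in total by PyTorch" figure sitting well above "already allocated" is the fragmentation signal that same error message points at. A quick sketch for inspecting the allocator state from a Python console (assuming the webui's venv with a CUDA build of torch):

```python
import torch

# Compare live tensor memory against what the caching allocator is holding;
# a large gap between the two usually means fragmented cached blocks.
allocated = torch.cuda.memory_allocated() / 2**30
reserved = torch.cuda.memory_reserved() / 2**30
print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")

# Per-size breakdown of the allocator's blocks, for more detail.
print(torch.cuda.memory_summary())
```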

mediocreatmybest avatar Oct 09 '22 10:10 mediocreatmybest

Also can't run Textual Inversion anymore; it runs out of memory by about 500 MB on an 8 GB card. I could run it easily before hypernetworks were added.

Heathen avatar Oct 09 '22 20:10 Heathen

I had the same issue, but after I clicked the Train button again it worked fine for me. There is no "Applying cross attention optimization." log line the second time.

univeous avatar Oct 11 '22 11:10 univeous

I'm having the same issue: 6 GB VRAM, trying to allocate 512 MiB with 4.59 GiB already allocated. Is there a way to allocate memory better? I think a patch or script that auto-handled memory allocation would benefit all of us in the long run. Also posted on Stack Exchange.

File "E:\stable_work_flow_gui\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 276, in grad return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 6.00 GiB total capacity; 4.59 GiB already allocated; 0 bytes free; 5.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Limitlessmatrix avatar Oct 11 '22 19:10 Limitlessmatrix

Disabling "Save an image to log directory every N steps" helped with interruptions in hypernetworks training (rtx 2060 super)

mityarko avatar Oct 12 '22 03:10 mityarko

This didn't happen to me yesterday (Oct 14th); now I am running out of CUDA memory during hypernetwork training, specifically when it's trying to create a preview image. Setting the --medvram argument kinda solves it, but training is very slow now.

tuangd avatar Oct 15 '22 14:10 tuangd

> This didn't happen to me yesterday (Oct 14th); now I am running out of CUDA memory during hypernetwork training, specifically when it's trying to create a preview image. Setting the --medvram argument kinda solves it, but training is very slow now.

Yeah, my understanding is that --medvram etc. comes with a performance hit.

mediocreatmybest avatar Oct 16 '22 00:10 mediocreatmybest

After today's git pull it's working now, though. No CUDA out-of-memory warning during hypernetwork training.

tuangd avatar Oct 16 '22 06:10 tuangd

As of commit 172c4bc09f0866e7dd114068ebe0f9abfe79ef33, Textual Inversion is working for me again with xformers and the "Use cross attention optimizations while training" option switched on in the settings tab.

mediocreatmybest avatar Nov 02 '22 13:11 mediocreatmybest