
Textual Inversion CUDA errors (mentions hypernetwork)

Rartist opened this issue 2 years ago • 12 comments

Describe the bug
I was able to test out and use Textual Inversion two or three days ago. Suddenly I am running into CUDA errors, even when trying to train on different models.

Traceback (most recent call last):
  File "F:\StableDiffusion\stable-diffusion-webui\modules\ui.py", line 158, in f
    res = list(func(*args, **kwargs))
  File "F:\StableDiffusion\stable-diffusion-webui\webui.py", line 65, in f
    res = func(*args, **kwargs)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\textual_inversion\ui.py", line 29, in train_embedding
    embedding, filename = modules.textual_inversion.textual_inversion.train_embedding(*args)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 223, in train_embedding
    loss.backward()
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\function.py", line 253, in apply
    return user_fn(self, *args)
  File "F:\StableDiffusion\stable-diffusion-webui\repositories\stable-diffusion\ldm\modules\diffusionmodules\util.py", line 138, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "F:\StableDiffusion\stable-diffusion-webui\repositories\stable-diffusion\ldm\modules\attention.py", line 243, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "F:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "F:\StableDiffusion\stable-diffusion-webui\modules\hypernetwork.py", line 75, in attention_CrossAttention_forward
    sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 6.00 GiB total capacity; 4.94 GiB already allocated; 0 bytes free; 5.10 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
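
For what it's worth, the max_split_size_mb knob the error message mentions is read from the PYTORCH_CUDA_ALLOC_CONF environment variable when CUDA is first initialized, so it has to be set before torch touches the GPU. A minimal sketch (the 128 MB value here is only an illustration, not a tuned recommendation):

```python
# Must run before the first CUDA allocation: the caching allocator reads
# PYTORCH_CUDA_ALLOC_CONF once, when CUDA is initialized.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Any CUDA work from here on uses the configured split size.
x = torch.randn(1024, 1024, device="cuda")
print(torch.cuda.memory_reserved() / 2**20, "MiB reserved")
```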

Desktop (please complete the following information):

  • OS: Windows 10
  • Commit revision 87db6f01cc6b118fe0c82c36c6686d72d060c417

Additional context
The last time I used it was before the implementation of Hypernetworks.

Rartist avatar Oct 08 '22 07:10 Rartist

Yep, I'm getting CUDA out-of-memory errors now while trying to use Textual Inversion; it previously ran without issues. FYI, I was using Textual Inversion with the web UI on 6 GB of VRAM.

mediocreatmybest avatar Oct 08 '22 09:10 mediocreatmybest

As a quick follow-up: I rolled back to commit 2995107fa24cfd72b0a991e18271dcde148c2807 and it looks to be working fine.

mediocreatmybest avatar Oct 08 '22 10:10 mediocreatmybest

> As a quick follow-up: I rolled back to commit 2995107 and it looks to be working fine.

I tried that but can't confirm; it fails both with the base 1.4 model and with a merged CKPT.

Rartist avatar Oct 09 '22 08:10 Rartist

Thought I'd check again, as quite a bit of additional work has been done since the Hypernetwork inclusion. Yep, still hitting it. My configuration: Windows 10, standard Stable Diffusion 1.4 model, no Hypernetwork files/folders.

RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 6.00 GiB total capacity; 4.58 GiB already allocated; 0 bytes free; 4.84 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

In the console, it errors out just after applying the cross attention optimization.

Preparing dataset...
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:07<00:00, 6.50it/s]
0%| | 0/1000 [00:00<?, ?it/s]
Applying cross attention optimization.
Error completing request
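
The "reserved in total by PyTorch" figure sitting well above "already allocated" is the fragmentation signal that same error message points at. A quick sketch for inspecting the allocator state from a Python console (assuming the webui's venv with a CUDA build of torch):

```python
import torch

# Compare live tensor memory against what the caching allocator is holding;
# a large gap between the two usually means fragmented cached blocks.
allocated = torch.cuda.memory_allocated() / 2**30
reserved = torch.cuda.memory_reserved() / 2**30
print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")

# Per-size breakdown of the allocator's blocks, for more detail.
print(torch.cuda.memory_summary())
```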

mediocreatmybest avatar Oct 09 '22 10:10 mediocreatmybest

Also can't run Textual Inversion anymore; it runs out of memory by about 500 MB on an 8 GB card. I could run it easily before hypernetworks were added.

Heathen avatar Oct 09 '22 20:10 Heathen

I had the same issue, but after I clicked the Train button again it worked fine for me. There is no "Applying cross attention optimization." log line the second time.

univeous avatar Oct 11 '22 11:10 univeous

I'm having the same issue: 6 GB VRAM, trying to allocate 512 MiB with 4.59 GiB already allocated. Is there a way to allocate memory better? I think a patch or script that auto-handled memory allocation would benefit all of us in the long run. Also posted on Stack Exchange.

File "E:\stable_work_flow_gui\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 276, in grad return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 6.00 GiB total capacity; 4.59 GiB already allocated; 0 bytes free; 5.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Limitlessmatrix avatar Oct 11 '22 19:10 Limitlessmatrix

Disabling "Save an image to log directory every N steps" helped with interruptions in hypernetworks training (rtx 2060 super)

mityarko avatar Oct 12 '22 03:10 mityarko

This didn't happen to me yesterday (Oct 14th); now I am running out of CUDA memory during hypernetwork training, specifically when it's trying to create a preview image. Setting the --medvram argument kinda solves it, but training is very slow now.

tuangd avatar Oct 15 '22 14:10 tuangd

> This didn't happen to me yesterday (Oct 14th); now I am running out of CUDA memory during hypernetwork training, specifically when it's trying to create a preview image. Setting the --medvram argument kinda solves it, but training is very slow now.

Yeah, my understanding is that --medvram etc. comes with a performance hit.

mediocreatmybest avatar Oct 16 '22 00:10 mediocreatmybest

After today's git pull it's working now, though. No CUDA out-of-memory warning during hypernetwork training.

tuangd avatar Oct 16 '22 06:10 tuangd

As of commit 172c4bc09f0866e7dd114068ebe0f9abfe79ef33, Textual Inversion is working for me again with xformers and the "Use cross attention optimizations while training" option switched on in the settings tab.

mediocreatmybest avatar Nov 02 '22 13:11 mediocreatmybest