stable-diffusion-webui

[Bug]: Constant hanging/freezing on image generation

Open Ziehn opened this issue 3 years ago • 33 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

After the recent commits from the 25th onward, I get constant hangups/freezing on image generation, with no errors in the log. When it unfreezes, the reported speed is absurdly slow, at 20 s/it or more.

Rolling back to a stable version on the 24th fixes all freezing

GTX 1080 TI

Steps to reproduce the problem

Generate image

Watch as it freezes

Get image 2 mins later than usual

What should have happened?

Smooth and timely image generation

Commit where the problem happens

955df77

What platforms do you use to access the UI ?

Windows

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

--xformers --no-half-vae

List of extensions

Happens with or without extensions

Stable-Diffusion-Webui-Civitai-Helper a1111-sd-webui-tagcomplete sd-dynamic-prompts sd-dynamic-thresholding sd-webui-controlnet stable-diffusion-webui-composable-lora stable-diffusion-webui-images-browser

Console logs

venv "D:\Stable Diffusion\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.10 (tags/v3.10.10:aad5f6a, Feb  7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
Commit hash: 955df7751eef11bb7697e2d77f6b8a6226b21e13
Installing requirements for Web UI
Installing sd-dynamic-prompts requirements.txt



Launching Web UI with arguments: --xformers --no-half-vae
D:\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torchvision\transforms\functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Civitai Helper: Get Custom Model Folder
Civitai Helper: Load setting from: D:\Stable Diffusion\stable-diffusion-webui\extensions\Stable-Diffusion-Webui-Civitai-Helper\setting.json
Civitai Helper: No setting file, use default
Loading weights [75bcab05df] from D:\Stable Diffusion\stable-diffusion-webui\models\Stable-diffusion\Z-Mix2.2.safetensors
Creating model from config: D:\Stable Diffusion\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: D:\Stable Diffusion\stable-diffusion-webui\models\VAE\Anything-V3.0.vae.pt
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(31): abcdef_mirajane, advntr, albino_style, aurate, charturnerv2, corneo_bowsette, easynegative, ng_deepnegative_v1_75t, pastel_style, RebeccaEdgerunners, rem_rezero, was-battletech, yoko v1
Model loaded in 4.4s (load weights from disk: 0.4s, create model: 0.4s, apply weights to model: 0.7s, apply half(): 0.6s, load VAE: 0.6s, move model to device: 0.7s, load textual inversion embeddings: 1.0s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 14.4s (import torch: 2.3s, import gradio: 1.1s, import ldm: 0.3s, other imports: 1.1s, load scripts: 2.3s, load SD checkpoint: 4.5s, create ui: 2.6s, gradio launch: 0.1s).
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:14<00:00,  3.73s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:27<00:00,  1.35s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:27<00:00,  1.32s/it]
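
To put a number on the slowdown visible in the progress lines above: 20 steps in 01:14 is about 3.7 s/it, while 20 steps in 00:27 is 1.35 s/it. A minimal sketch of that arithmetic, assuming the elapsed time is printed as MM:SS (longer runs may use a different format); `seconds_per_step` is a hypothetical helper, and the progress bars in the sample strings are shortened.

```python
import re
from typing import Optional

# Matches the "done/total [MM:SS<" part of a tqdm-style progress line.
PROGRESS = re.compile(r"(\d+)/(\d+) \[(\d+):(\d+)<")


def seconds_per_step(line: str) -> Optional[float]:
    """Return elapsed seconds divided by completed steps, or None if the line does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    done, _total, minutes, seconds = (int(group) for group in match.groups())
    if done == 0:
        return None
    return (minutes * 60 + seconds) / done


# Sample lines modeled on the console log, with the bars abbreviated.
first_run = "100%|#####| 20/20 [01:14<00:00,  3.73s/it]"
later_run = "Total progress: 100%|#####| 20/20 [00:27<00:00,  1.35s/it]"

print(seconds_per_step(first_run), seconds_per_step(later_run))
```

Comparing this value across runs on different commits gives a rough way to confirm whether a rollback actually changed anything.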

Additional information

No response

Ziehn avatar Mar 28 '23 06:03 Ziehn

same issue here +1

chifeisoong avatar Mar 28 '23 06:03 chifeisoong

Starting to look like some sort of incompatibility was introduced with Firefox. Been testing with Edge and not getting any freezing so far

> same issue here +1

Which web browser are you using?

Ziehn avatar Mar 28 '23 10:03 Ziehn

It freezes for me on Opera OSX. I will try Google Chrome. But I also rolled back to a commit from a week ago already. Never had this before, but I am not sure which commit I was using the last weeks (I just updated yesterday and everything broke). master is completely unusable for me right now. It will hang loading the GUIs.

Edit: I can confirm that loading master with Google Chrome (and Edge) actually works. So for me the incompatibility with the current master has to do with Opera. I did not check if image generation still hangs.

Edit2: Still hangs randomly after generating some images (in Chrome). Will test Edge next.

Edit3: It actually does not seem to hang when using with Edge. Wow... Feels like in the early 2000s with every Browser working different for any given HTML.

Edit4: Well, now it hung with Edge too. Just seems to happen much more seldom when using it.

oderwat avatar Mar 28 '23 13:03 oderwat

I would try disabling all extensions that aren't built-in first.

I know that sd-dynamic-prompts freezes my browser right now. Using the base Stable Diffusion install without extensions seems to work fine. If that works for you, I'd then slowly enable extensions one at a time to isolate the issue.

notkmikiya avatar Mar 29 '23 04:03 notkmikiya

Already tested without extensions, first thing I tried, no dice.

Ziehn avatar Mar 29 '23 10:03 Ziehn

> Already tested without extensions, first thing I tried, no dice.

@Ziehn That's pretty rough, not sure what's causing that. Maybe there's something that can be picked up in the browser dev tools console log? I saw some people with browser issues doing this to help in #9027.

Also, it looks like you're using torch: 2.0.0+cu118 and calling --xformers. Did you recompile xformers onto the new pytorch?

I have a friend with a 1080 TI and using --xformers seemed to hurt performance more than help it. Instead, using --opt-sdp-attention or --opt-sdp-no-mem-attention gave a decent performance boost of about 1it/s for him.
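
Checking which torch and xformers builds are actually installed is a quick first step before deciding whether xformers needs rebuilding. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and it has to be run with the venv's own Python (the `venv\Scripts\Python.exe` shown in the console log) to see the web UI's packages. It only reports versions and does not check whether the two builds are compatible.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages discussed in this thread.
for name in ("torch", "xformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

A torch version string ending in `+cu118` versus `+cu117` also shows which CUDA build is in use, which matters for the torch 2.0.0 versus 1.13.1 comparisons made later in the thread.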

notkmikiya avatar Mar 29 '23 18:03 notkmikiya

@notkmikiya Tried both of your suggested arguments, neither did anything, with or without xformers. I can also confirm I get slower generation without xformers.

As well, generation speeds back up when closing the afflicted web browser. No change in speed closing Edge.

Ziehn avatar Mar 29 '23 20:03 Ziehn

I found a backup of the version I used for a long time before everything started to fall apart, and I am back on a9fed7c364061ae6efb37f797b6b522cb3cf7aa2. It works both with all the code + venv from the backup and with my newest venv (torch 2) when just running that commit. I guess I'll check occasionally whether newer versions work, but that old version can do all I want.

oderwat avatar Mar 29 '23 22:03 oderwat

This issue may have something to do with --no-half-vae on the recent updates. My friend had this same issue today after placing that into his webui-user.bat. After removing it, it went away.

notkmikiya avatar Mar 30 '23 19:03 notkmikiya

I have the same issue as of a recent commit, and I am not using --no-half-vae. It happens on both a GV100 and a 1080 Ti, using torch 2 and xformers. I will try falling back to the commit mentioned by @oderwat.

update: unfortunately I am still getting the same issue, generation will just hang with no message, issue happens regardless of browser used (tried Chrome and Firefox)

update2: seems this is an issue with previews? i have not had the error since disabling previews, frustrating, but works for now, will try bumping to the latest commit again and see if everything keeps working

update3: no issues on the newest commit with previews disabled

thot-experiment avatar Apr 02 '23 08:04 thot-experiment

@thot-experiment seems to have found the issue. Something to do with live previews is causing SD to infinitely loop (100% gpu usage, never finishes).

It also only seems to happen with torch 2.0.0+cu118 and --opt-sdp-no-mem-attention. My GPU is a 4070ti.

I was running live previews with preview mode full and a fairly low update period of 400ms.

It always seems to hang up just as an image is finishing, so I assume it is caused by some of the code which finishes up the image in some way? I am not at all familiar with any of this, but when I attached with VS Code it seemed to be hung up on the decode_first_stage call at around processing.py:655. Although this is literally the first time I've ever attached to a Python program, so that might be meaningless.

EDIT: Just remembered, I also had the live preview sample step rate set to 1, which might increase the chance of this happening.
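
For anyone who wants to see where a hung process is stuck without attaching a debugger, Python's standard `faulthandler` module can dump every thread's stack after a timeout. A minimal sketch, not web UI code: `watch_for_hangs`, `stop_watching` and `slow_step` are hypothetical names, `slow_step` merely stands in for a call that hangs, and where such a watchdog would be armed inside the web UI is left open.

```python
import faulthandler
import sys
import time


def watch_for_hangs(timeout_s: float, log_file=sys.stderr) -> None:
    """Dump all thread stacks to `log_file` if `timeout_s` passes before stop_watching()."""
    # `log_file` must stay open until the dump happens or the watchdog is cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=log_file)


def stop_watching() -> None:
    """Cancel the pending dump, e.g. once the work finished in time."""
    faulthandler.cancel_dump_traceback_later()


def slow_step(seconds: float) -> None:
    # Hypothetical stand-in for a call that hangs; it only sleeps.
    time.sleep(seconds)


# Demo: the watchdog fires after 0.2 s while slow_step is still sleeping,
# so a traceback naming slow_step is written to stderr.
watch_for_hangs(0.2)
slow_step(0.5)
stop_watching()
```

With a timeout of a few minutes around an image generation, the dump would show whether the main thread is really sitting in a call such as `decode_first_stage` or somewhere else.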

sliftist avatar Apr 02 '23 19:04 sliftist

FWIW i do not have --opt-sdp-no-mem-attention set explicitly, but perhaps it gets turned on implicitly by some other flag or configuration state? (i don't even see it listed in the docs)

thot-experiment avatar Apr 02 '23 19:04 thot-experiment

> It also only seems to happen with torch 2.0.0+cu118 and --opt-sdp-no-mem-attention. My GPU is a 4070ti

I'm also on torch 2.0.0+cu118 and experiencing this issue on an RTX 2070 with --opt-sdp-attention . The issue persists with and without live previews. However, it seems to only happen sometimes. Some images process and upscale normally, while others finish processing, then hang when saving the image. After a couple minutes it processes, but this is not normal behavior. On commit a9fed7c3, this is not an issue for me.

Daemonrat avatar Apr 04 '23 15:04 Daemonrat

For me this happens too on rtx 2060. Also the general ui responds SUPER sluggish with a delay to many different actions.

chille9 avatar Apr 18 '23 02:04 chille9

Same thing happening to me occasionally since updating to torch: 2.0.0+cu118, running with --opt-sdp-no-mem-attention, and without --xformers.

Zullian avatar Apr 22 '23 00:04 Zullian

Happens to me using a 2070 Super but on torch: 1.13.1+cu117. Happens more often when I do other stuff on my pc, especially Photoshop or Lightroom totally freezes SD. But other actions too...

Stibo avatar Apr 22 '23 18:04 Stibo

I started seeing this problem after upgrading my environment from Python 3.9 to Python 3.10 (and rebuilding the venv.) Notably, Xformers went from 0.0.14 to 0.0.17 as part of this upgrade. Using Brave browser.

ThereforeGames avatar Apr 23 '23 02:04 ThereforeGames

This issue still exists with the latest 1.1.1 version. GPU usage will go to 100% and stay there.

Deleting venv and installing everything from scratch didn't help

EDIT: It only happens when using torch 2.0.0+cu118 - never seen it happening on torch 1.13.1+cu117

andypotato avatar May 03 '23 01:05 andypotato

> EDIT: It only happens when using torch 2.0.0+cu118 - never seen it happening on torch 1.13.1+cu117

I'm pulling my hair out because of this, I am not tech savvy how do I downgrade torch 2.0 to torch 1.13.1?

DeonHolo avatar May 07 '23 19:05 DeonHolo

> > EDIT: It only happens when using torch 2.0.0+cu118 - never seen it happening on torch 1.13.1+cu117

> I'm pulling my hair out because of this, I am not tech savvy how do I downgrade torch 2.0 to torch 1.13.1?

Delete your venv folder and redownload is the easiest way

Ziehn avatar May 07 '23 20:05 Ziehn

> Delete your venv folder and redownload is the easiest way

This can help with dependency issues after upgrades but does NOT solve this issue.

For me the problem went away after I changed the number of previews generated from "every 3 steps" to "every 5 steps". Haven't seen the issue again after that.

andypotato avatar May 07 '23 23:05 andypotato

> > Delete your venv folder and redownload is the easiest way

> This can help with dependency issues after upgrades but does NOT solve this issue.

> For me the problem went away after I changed the number of previews generated from "every 3 steps" to "every 5 steps". Haven't seen the issue again after that.

I'm aware? This was in response to a user wanting to downgrade to torch 1.13.1

Only fix I've found for this hanging issue is to move on from AUTO1111. Vlad seems to be working much better for me

Ziehn avatar May 08 '23 00:05 Ziehn

> For me the problem went away after I changed the number of previews generated from "every 3 steps" to "every 5 steps". Haven't seen the issue again after that.

Is this in the Live Preview setting where it says "Show new live preview image every N sampling steps. Set to -1 to show after completion of batch."? Mine was set to 10 by default; it wasn't on 3.

EDIT: Still hangs even with this change. I'm just gonna wait for a while until an official fix is out. Please reply if there already is~

DeonHolo avatar May 08 '23 02:05 DeonHolo

Yes that's the setting I was talking about. I think the value that works for you depends on what kind of GPU you are using. I'm using a 3060 / 12GB and I can get away with a preview image every 5 sampling steps.

You could try disabling the live preview completely by setting it to -1 first. If this makes the issue go away then at least you have a workaround.
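
The Settings tab is the simplest place to change this. For setups where the UI itself is too sluggish to use, here is a minimal sketch of making the same change in the settings file. It assumes the settings are stored as JSON in a `config.json` in the web UI folder and that this option's key is `show_progress_every_n_steps`; both are assumptions to check against your own file. `disable_live_previews` is a hypothetical helper, and the web UI should be stopped before the file is edited.

```python
import json
from pathlib import Path


def disable_live_previews(config_path: Path) -> dict:
    """Set the live preview interval to -1 in a JSON settings file and return the settings."""
    if config_path.exists():
        settings = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        settings = {}
    # Assumed key for "Show new live preview image every N sampling steps".
    settings["show_progress_every_n_steps"] = -1
    config_path.write_text(json.dumps(settings, indent=4), encoding="utf-8")
    return settings
```

Calling `disable_live_previews(Path("config.json"))` from the web UI folder would apply the change; all other keys in the file are kept as they were.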

andypotato avatar May 08 '23 05:05 andypotato

I have a 3060ti. I am going to try setting it to -1.

DeonHolo avatar May 08 '23 05:05 DeonHolo

Changing live preview from 1 to 0 fixed the freezing issue for me. 2070 Super 8GB. Thanks!

kimraven11 avatar May 08 '23 06:05 kimraven11

Tried changing live preview to -1, 0, and 5. Still have the problem hNHGNGNHNHN GNN

DeonHolo avatar May 08 '23 22:05 DeonHolo

~~1.20 appears to fix it haven't had any issues after upgrading~~

nevermind

zer0mania avatar May 14 '23 00:05 zer0mania

Now I am not really sure. I had the same exact problem when I'm playing Diablo 4. GPU goes 100% utilization and everything becomes sluggish/slows down. Every action I do slows down including Google browser, the game, file explorer, etc. Maybe it's a GPU problem?

DeonHolo avatar May 14 '23 08:05 DeonHolo

Ok I am not sure but I removed the --no-half-vae argument in my commandline and it seems to fix it?

DeonHolo avatar May 15 '23 21:05 DeonHolo