Sudden increase in move model time, freezing PC
Hi, I've been generating hundreds of images fine for the last couple of months using ForgeUI and some SDXL models, but now it's just breaking...
For the last 2 days, my "moving models" time has exploded to ~300 seconds. A typical run goes like this now:
Run 1024x1024 with 2 loras, no adetailer
Loading models ~ 10 seconds
Moving models ~ 8 seconds
Generate image ~ 16 seconds
Freezes at 100% for a couple of minutes; the PC becomes unusable during this time
Finally finishes with moving models claiming to take 300 or so seconds.
During this time, my RAM seemed to be maxed out. I have 16GB DDR4, 3000MHz. I've heard this can be a bit low, but it's been working fine for the last couple of months. Apart from that, I've got a 3070 Ti, I'm running Windows 10, and my Forge install is on an M.2 drive.
Seems odd. I've generated higher-res images, with more loras and adetailer, all fine. Now suddenly these issues. Any ideas on a fix?
Thanks!!
CMD copy-and-paste of the run:
To create a public link, set `share=True` in `launch()`.
Startup time: 53.6s (prepare environment: 22.0s, launcher: 0.8s, import torch: 14.3s, initialize shared: 0.4s, other imports: 0.6s, load scripts: 7.0s, create ui: 5.1s, gradio launch: 3.3s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
[GPU Setting] You will use 87.50% GPU memory (7167.00 MB) to load weights, and use 12.50% GPU memory (1024.00 MB) to do matrix computation.
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
StateDict Keys: {'unet': 1680, 'vae': 250, 'text_encoder': 197, 'text_encoder_2': 518, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
IntegratedAutoencoderKL Unexpected: ['model_ema.decay', 'model_ema.num_updates']
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Model loaded in 10.8s (unload existing model: 0.3s, forge model load: 10.6s).
[Unload] Trying to free 3051.58 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 1738.05 MB, Model Require: 1559.68 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: -845.63 MB, CPU Swap Loaded (blocked method): 1204.12 MB, GPU Loaded: 548.55 MB
Moving model(s) has taken 7.23 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 1182.03 MB ... Done.
[Unload] Trying to free 2902.26 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1168.46 MB ... Unload model JointTextEncoder Done.
[Memory Management] Target: KModel, Free GPU: 2010.22 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 986.22 MB, All loaded to GPU.
Moving model(s) has taken 13.94 seconds
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:16<00:00, 1.21it/s]
[Unload] Trying to free 4563.42 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1841.41 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 6990.30 MB, Model Require: 159.56 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 5806.74 MB, All loaded to GPU.
Moving model(s) has taken 331.09 seconds
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
[GPU Setting] You will use 87.50% GPU memory (7167.00 MB) to load weights, and use 12.50% GPU memory (1024.00 MB) to do matrix computation.
Same here, it's slower after the last update I did last night!!
I am using SageMaker and I am facing the following problem. Generation goes as it should, but under some condition the system freezes completely (probably when the X/Y/Z script changes the model; I also can't use hires fix). This is what I get in the console:
To create a public link, set share=True in launch().
IIB Database file has been successfully backed up to the backup folder.
Startup time: 55.5s (prepare environment: 18.1s, launcher: 0.7s, import torch: 18.7s, initialize shared: 0.1s, other imports: 0.5s, load scripts: 5.2s, create ui: 6.0s, gradio launch: 5.9s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
[GPU Setting] You will use 93.14% GPU memory (13893.00 MB) to load weights, and use 6.86% GPU memory (1024.00 MB) to do matrix computation.
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_v01.safetensors', 'hash': 'ad446f67'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/mistoonAnime_v10Illustrious.safetensors', 'hash': '62a07c52'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_v01.safetensors', 'hash': 'ad446f67'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_smoothftSOLID.safetensors', 'hash': '6b45dafe'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
X/Y/Z plot will create 12 images on 3 2x2 grids. (Total steps to process: 300)
Total progress: 0it [00:00, ?it/s]Loading Model: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_smoothftSOLID.safetensors', 'hash': '6b45dafe'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
StateDict Keys: {'unet': 1680, 'vae': 248, 'text_encoder': 197, 'text_encoder_2': 518, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Model loaded in 40.9s (unload existing model: 0.5s, forge model load: 40.5s).
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-UNet with 1298 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-CLIP with 264 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[Unload] Trying to free 3051.58 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 9793.00 MB, Model Require: 1559.68 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 7209.32 MB, All loaded to GPU.
Moving model(s) has taken 11.02 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7997.05 MB ... Done.
[Unload] Trying to free 2856.18 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7995.55 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 7995.55 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 6971.55 MB, All loaded to GPU.
Moving model(s) has taken 4.97 seconds
0%| | 0/25 [00:00<?, ?it/s]
4%|█▊ | 1/25 [00:01<00:37, 1.55s/it]
8%|███▌ | 2/25 [00:02<00:23, 1.04s/it]
12%|█████▎ | 3/25 [00:02<00:19, 1.12it/s]
16%|███████ | 4/25 [00:03<00:17, 1.21it/s]
20%|████████▊ | 5/25 [00:04<00:15, 1.28it/s]
24%|██████████▌ | 6/25 [00:05<00:14, 1.32it/s]
28%|████████████▎ | 7/25 [00:05<00:13, 1.34it/s]
32%|██████████████ | 8/25 [00:06<00:12, 1.36it/s]
36%|███████████████▊ | 9/25 [00:07<00:11, 1.37it/s]
40%|█████████████████▏ | 10/25 [00:07<00:10, 1.37it/s]
44%|██████████████████▉ | 11/25 [00:08<00:10, 1.38it/s]
48%|████████████████████▋ | 12/25 [00:09<00:09, 1.38it/s]
52%|██████████████████████▎ | 13/25 [00:10<00:08, 1.38it/s]
56%|████████████████████████ | 14/25 [00:10<00:07, 1.38it/s]
60%|█████████████████████████▊ | 15/25 [00:11<00:07, 1.38it/s]
64%|███████████████████████████▌ | 16/25 [00:12<00:06, 1.38it/s]
68%|█████████████████████████████▏ | 17/25 [00:13<00:05, 1.38it/s]
72%|██████████████████████████████▉ | 18/25 [00:13<00:05, 1.38it/s]
76%|████████████████████████████████▋ | 19/25 [00:14<00:04, 1.38it/s]
80%|██████████████████████████████████▍ | 20/25 [00:15<00:03, 1.38it/s]
84%|████████████████████████████████████ | 21/25 [00:15<00:02, 1.37it/s]
88%|█████████████████████████████████████▊ | 22/25 [00:16<00:02, 1.38it/s]
92%|███████████████████████████████████████▌ | 23/25 [00:17<00:01, 1.38it/s]
96%|█████████████████████████████████████████▎ | 24/25 [00:18<00:00, 1.38it/s]
100%|███████████████████████████████████████████| 25/25 [00:18<00:00, 1.33it/s]
[Unload] Trying to free 8820.57 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7985.77 MB ... Unload model JointTextEncoder Current free memory is 9746.38 MB ... Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 9746.38 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 8403.26 MB, All loaded to GPU.
Moving model(s) has taken 1.61 seconds
[Unload] Trying to free 3302.87 MB for cuda:0 with 0 models keep loaded ... Current free memory is 9423.13 MB ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 9423.13 MB, Model Require: 1752.98 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 6646.15 MB, All loaded to GPU.
Moving model(s) has taken 0.88 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7657.46 MB ... Done.
[Unload] Trying to free 2856.18 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7655.95 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 7655.95 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 6631.95 MB, All loaded to GPU.
Moving model(s) has taken 1.38 seconds
0%| | 0/25 [00:00<?, ?it/s]
4%|█▊ | 1/25 [00:00<00:16, 1.45it/s]
100%|███████████████████████████████████████████| 25/25 [00:18<00:00, 1.38it/s]
[Unload] Trying to free 8405.72 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7633.74 MB ... Unload model JointTextEncoder Current free memory is 9398.80 MB ... Done.
Memory cleanup has taken 0.69 seconds
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-UNet with 1298 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-CLIP with 264 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[Unload] Trying to free 3302.87 MB for cuda:0 with 0 models keep loaded ... Current free memory is 9399.05 MB ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 9399.05 MB, Model Require: 1752.98 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 6622.07 MB, All loaded to GPU.
Moving model(s) has taken 1.07 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7629.58 MB ... Done.
[Unload] Trying to free 2856.18 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7628.08 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 7628.08 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 6604.08 MB, All loaded to GPU.
Moving model(s) has taken 3.40 seconds
0%| | 0/25 [00:00<?, ?it/s]
4%|█▊ | 1/25 [00:00<00:17, 1.41it/s]
100%|███████████████████████████████████████████| 25/25 [00:18<00:00, 1.38it/s]
[Unload] Trying to free 8405.72 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7624.61 MB ... Unload model JointTextEncoder Current free memory is 9393.46 MB ... Done.
Memory cleanup has taken 1.14 seconds
[Unload] Trying to free 3302.87 MB for cuda:0 with 0 models keep loaded ... Current free memory is 9393.70 MB ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 9393.70 MB, Model Require: 1752.98 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 6616.72 MB, All loaded to GPU.
Moving model(s) has taken 0.88 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7625.65 MB ... Done.
[Unload] Trying to free 2856.18 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7624.15 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 7624.15 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 6600.15 MB, All loaded to GPU.
Moving model(s) has taken 1.40 seconds
0%| | 0/25 [00:00<?, ?it/s]
4%|█▊ | 1/25 [00:00<00:16, 1.43it/s]
100%|███████████████████████████████████████████| 25/25 [00:18<00:00, 1.36it/s]
[Unload] Trying to free 8405.72 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7625.64 MB ... Unload model JointTextEncoder Current free memory is 9393.08 MB ... Done.
Memory cleanup has taken 0.69 seconds
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_v01.safetensors', 'hash': 'ad446f67'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_v01.safetensors', 'hash': 'ad446f67'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 9393.32 MB ... Unload model KModel Current free memory is 14390.65 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 1680, 'vae': 248, 'text_encoder': 196, 'text_encoder_2': 518, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Model loaded in 44.0s (unload existing model: 2.1s, forge model load: 42.0s).
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-UNet with 1298 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[LORA] Loaded /home/studio-lab-user/stable-diffusion-webui-forge/models/Lora/illu_v0_1_color_spot_v1_0.safetensors for KModel-CLIP with 264 keys at weight 1.0 (skipped 0 keys) with on_the_fly = False
[Unload] Trying to free 3051.58 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 9723.21 MB, Model Require: 1559.68 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 7139.53 MB, All loaded to GPU.
Moving model(s) has taken 10.45 seconds
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 7959.51 MB ... Done.
[Unload] Trying to free 2856.18 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7958.01 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 7958.01 MB, Model Require: 0.00 MB, Previously Loaded: 4897.05 MB, Inference Require: 1024.00 MB, Remaining: 6934.01 MB, All loaded to GPU.
Moving model(s) has taken 2.51 seconds
0%| | 0/25 [00:00<?, ?it/s]
4%|█▊ | 1/25 [00:00<00:16, 1.41it/s]
100%|███████████████████████████████████████████| 25/25 [00:18<00:00, 1.37it/s]
[Unload] Trying to free 8820.57 MB for cuda:0 with 0 models keep loaded ... Current free memory is 7972.04 MB ... Unload model JointTextEncoder Current free memory is 9734.52 MB ... Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 9734.52 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 8391.40 MB, All loaded to GPU.
Moving model(s) has taken 0.78 seconds
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_smoothftSOLID.safetensors', 'hash': '6b45dafe'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': '/home/studio-lab-user/stable-diffusion-webui-forge/models/Stable-diffusion/tmp_models/illustriousXL_v01.safetensors', 'hash': 'ad446f67'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 9412.64 MB ... Unload model KModel
I'm still encountering an incomprehensible error. I completely reinstalled the interface. Everything works until the image is saved and the memory is released. At that point, everything freezes.
After a fresh install didn't fix it, I ended up buying more RAM, going from 16GB to 32GB. When running, it uses around 20GB of memory for me.
My workflow hasn't changed, so it doesn't make much sense that my memory requirement did, but it's fixed now. So maybe I've still got the issue and I'm just beating it with more RAM, or maybe the requirements just got higher?
@Shmupette
[GPU Setting] You will use 87.50% GPU memory (7167.00 MB) to load weights, and use 12.50% GPU memory (1024.00 MB) to do matrix computation.
For the GPU Weights slider at the top of the screen, change that number to ~5500. It looks like you're not leaving enough VRAM for the VAE and/or LoRAs. The weights slider determines how much VRAM is used for loading the data from the model, LoRAs, etc. The remainder is used for inference (computing the actual generation and decoding with the VAE).
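To make the split concrete, here's the arithmetic as a rough sketch using the numbers from the log above (Forge's internal accounting may differ slightly):

```python
# VRAM budget from the log: 7167 MB for weights + 1024 MB for computation.
total_vram_mb = 7167 + 1024        # 8191 MB usable on the 3070 Ti (8 GB)

# Current setting: almost everything goes to weights.
inference_mb = total_vram_mb - 7167
print(inference_mb)                # 1024 MB for inference, VAE decode, LoRAs

# Lowering the GPU Weights slider to ~5500 leaves much more headroom:
lowered_inference_mb = total_vram_mb - 5500
print(lowered_inference_mb)        # 2691 MB for inference, VAE decode, LoRAs
```

With only ~1 GB of compute headroom, Forge falls back to swapping model parts through system RAM (the "CPU Swap Loaded (blocked method)" line in your log), which is what makes "moving models" so slow.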
@xRikishi, you're in the same position:
[GPU Setting] You will use 93.14% GPU memory (13893.00 MB) to load weights, and use 6.86% GPU memory (1024.00 MB) to do matrix computation.
You can set your weights to ~9500.
For both of you, keep in mind other processes on your PC can be using VRAM, which will subtract from the 1,024 that you had left over.
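If you want to see what else is holding VRAM, `nvidia-smi` can list per-process usage. The sketch below parses made-up sample output (the process names and numbers are purely illustrative); run `nvidia-smi --query-compute-apps=process_name,used_memory --format=csv,noheader` yourself to get real figures:

```python
# Hypothetical sample output; replace with real `nvidia-smi` output.
sample = """\
chrome.exe, 412 MiB
python.exe, 6890 MiB
dwm.exe, 158 MiB
"""

other_vram_mb = 0
for line in sample.strip().splitlines():
    name, used = (field.strip() for field in line.split(","))
    mb = int(used.split()[0])          # "412 MiB" -> 412
    if name != "python.exe":           # assume python.exe is Forge itself
        other_vram_mb += mb

# VRAM held by other apps, eaten out of your leftover compute budget:
print(other_vram_mb)                   # 570
```

In this illustrative case, other apps hold 570 MiB, more than half of a 1,024 MB compute budget, so closing them (or lowering the weights slider further) buys real headroom.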