stable-diffusion-webui-forge
[Bug]: Forge WebUI taking almost one hour to generate a single image on a MacBook Air M1
Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
- [X] The issue exists in the current version of the webui
- [X] The issue has not been reported before recently
- [ ] The issue has been reported before but has not been fixed yet
What happened?
Txt2img is taking too long on a MacBook Air M1. After a fresh install from the git repository, I added a model, a VAE, and some LoRAs; generation takes almost 1 hour per 768x1152 image, with or without upscaling/refining. Is this normal?
Steps to reproduce the problem
On a fresh install of Forge WebUI on a MacBook Air M1, generate an image with this configuration:
score_9, score_8_up, score_7_up, score_6_up, OverallDetail, 1girl, solo, (tiefling), very long hair, white hair, bangs, ponytail, braided hair, long pointed ear, black makeup, thin body, (white skin), thigh gap, tail tiefling, black gloves, erotic pose, (sexy red clothing, sexy black stockings), (provocative look), concept art, illustration, realistic, Expressiveh, knva, perfect body, highly detailed, delicate and smooth skin, body in motion, <lora:add-detail-xl:1> <lora:Concept Art Twilight Style SDXL_LoRA_Pony Diffusion V6 XL:1> <lora:Expressive_H:0.8> <lora:Kenva:0.7> <lora:sinfully_stylish_SDKL:1> <lora:xl_more_art-full_v1:0.7>
Negative prompt: score_6, score_5, score_4, negativeXL_D, 3d
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 666, Size: 768x1152, Model hash: 67ab2fd8ec, Model: ponyDiffusionV6XL_v6StartWithThisOne, VAE hash: 2125bad8d3, VAE: xlVAEC_f1.safetensors, Denoising strength: 0.7, Hires upscale: 1.5, Hires steps: 5, Hires upscaler: Latent, Lora hashes: "add-detail-xl: 9c783c8ce46c, Concept Art Twilight Style SDXL_LoRA_Pony Diffusion V6 XL: e5fe96cd307b, Expressive_H: 5671f20a9a6b, Kenva: cfa45d23d34c, sinfully_stylish_SDKL: 076fa4d920a9, xl_more_art-full_v1: fe3b4816be83", TI hashes: "negativeXL_D: fff5d51ab655, negativeXL_D: fff5d51ab655, negativeXL_D: fff5d51ab655, negativeXL_D: fff5d51ab655", Version: f0.0.17v1.8.0rc-latest-276-g29be1da7
Time taken: 47 min. 26.2 sec.
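To time the same generation without the browser in the loop, the request can also be scripted against the txt2img API. A minimal sketch, assuming Forge is launched with the `--api` flag and that the endpoint follows the standard `/sdapi/v1/txt2img` request schema:

```python
# Minimal sketch: time the same 768x1152 hires-fix generation through the API.
# Assumes Forge was started with --api; field names follow the standard
# /sdapi/v1/txt2img request schema.
import time
import requests

payload = {
    # abbreviated here -- use the full prompt from above, including the <lora:...> tags
    "prompt": "score_9, score_8_up, score_7_up, score_6_up, OverallDetail, 1girl, ...",
    "negative_prompt": "score_6, score_5, score_4, negativeXL_D, 3d",
    "steps": 20,
    "sampler_name": "Euler a",
    "cfg_scale": 7,
    "seed": 666,
    "width": 768,
    "height": 1152,
    "enable_hr": True,
    "hr_scale": 1.5,
    "hr_second_pass_steps": 5,
    "hr_upscaler": "Latent",
    "denoising_strength": 0.7,
}

start = time.time()
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=2 * 60 * 60)
resp.raise_for_status()
print(f"{len(resp.json()['images'])} image(s) in {time.time() - start:.1f} s")
```

The `<lora:...>` tags stay in the prompt string, so an API run should exercise the same LoRA loading path as the UI.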
What should have happened?
As far as I understand, even a long generation shouldn't take more than 5-10 minutes. Almost an hour seems excessive to me, but maybe I am missing something important here.
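To rule out torch silently falling back to the CPU, the MPS backend can be checked directly from inside the webui venv (`source venv/bin/activate`). A minimal sketch; the small matmul is only a rough sanity check, not a benchmark:

```python
# Minimal sketch: confirm the MPS backend is built and usable in this torch
# install, and compare a small matmul on CPU vs MPS as a rough sanity check.
import time
import torch

print("torch:", torch.__version__)                    # log below shows 2.1.0 (2.1.2 is the tested version)
print("MPS built:", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

def avg_matmul_time(device: str, n: int = 2048, repeats: int = 10) -> float:
    x = torch.randn(n, n, device=device)
    _ = x @ x  # warm-up so one-off setup cost is not timed
    start = time.time()
    for _ in range(repeats):
        _ = x @ x
    if device == "mps":
        torch.mps.synchronize()  # wait for queued GPU work before stopping the clock
    return (time.time() - start) / repeats

for dev in ("cpu", "mps"):
    print(f"{dev}: {avg_matmul_time(dev) * 1000:.1f} ms per 2048x2048 matmul")
```

If MPS is clearly faster than CPU here, the device itself is working and the slowdown is coming from somewhere else; the log below shows only 16 GB of shared memory for an SDXL checkpoint plus six LoRAs, which may be a factor.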
What browsers do you use to access the UI?
Mozilla Firefox
Sysinfo
Console logs
USERNAME@HOME stable-diffusion-webui-forge main ./webui.sh
################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################
################################################################
Running on USERNAME user
################################################################
################################################################
Repo already cloned, using it as install directory
################################################################
################################################################
Create and activate python venv
################################################################
################################################################
Launching launch.py...
################################################################
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)]
Version: f0.0.17v1.8.0rc-latest-276-g29be1da7
Commit hash: 29be1da7cf2b5dccfc70fbdd33eb35c56a31ffb7
Legacy Preprocessor init warning: Unable to install insightface automatically. Please try run `pip install insightface` manually.
Launching Web UI with arguments: --skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate
Total VRAM 16384 MB, total RAM 16384 MB
Set vram state to: SHARED
Device: mps
VAE dtype: torch.float32
CUDA Stream Activated: False
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
==============================================================================
You are running torch 2.1.0.
The program is tested to work with torch 2.1.2.
To reinstall the desired version, run with commandline flag --reinstall-torch.
Beware that this will cause a lot of large files to be downloaded, as well as
there are reports of issues with training tab on the latest version.
Use --skip-version-check commandline argument to disable this check.
==============================================================================
ControlNet preprocessor location: /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/ControlNetPreprocessor
Loading weights [67ab2fd8ec] from /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Stable-diffusion/ponyDiffusionV6XL_v6StartWithThisOne.safetensors
2024-05-16 08:24:35,896 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Startup time: 10.9s (prepare environment: 0.6s, import torch: 3.4s, import gradio: 1.2s, setup paths: 1.3s, other imports: 1.6s, load scripts: 1.3s, create ui: 0.6s, gradio launch: 0.7s).
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
Loading VAE weights specified in settings: /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/VAE/xlVAEC_f1.safetensors
To load target model SDXLClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 18.9s (load weights from disk: 0.9s, forge load real models: 15.0s, load VAE: 0.5s, calculate empty prompt: 2.4s).
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/add-detail-xl.safetensors for SDXL-UNet with 722 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/add-detail-xl.safetensors for SDXL-CLIP with 264 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Concept Art Twilight Style SDXL_LoRA_Pony Diffusion V6 XL.safetensors for SDXL-UNet with 722 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Concept Art Twilight Style SDXL_LoRA_Pony Diffusion V6 XL.safetensors for SDXL-CLIP with 264 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Expressive_H-000001.safetensors for SDXL-UNet with 722 keys at weight 0.8 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Expressive_H-000001.safetensors for SDXL-CLIP with 264 keys at weight 0.8 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Kenva.safetensors for SDXL-UNet with 722 keys at weight 0.7 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/Kenva.safetensors for SDXL-CLIP with 264 keys at weight 0.7 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/sinfully_stylish_SDXL.safetensors for SDXL-UNet with 722 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/sinfully_stylish_SDXL.safetensors for SDXL-CLIP with 264 keys at weight 1.0 (skipped 0 keys)
[LORA] Loaded /Users/USERNAME/Projects/stable-diffusion-webui-forge/models/Lora/xl_more_art-full_v1.safetensors for SDXL-UNet with 788 keys at weight 0.7 (skipped 0 keys)
To load target model SDXLClipModel
Begin to load 1 model
Reuse 1 loaded models
Moving model(s) has taken 5.69 seconds
To load target model SDXL
Begin to load 1 model
Moving model(s) has taken 63.00 seconds
40%|█████████████████▌ | 8/20 [12:13<18:19, 91.65s/it]
To load target model AutoencoderKL | 8/25 [13:31<29:04, 102.60s/it]
Begin to load 1 model
Moving model(s) has taken 2.36 seconds
Total progress: 32%|████████▋ | 8/25 [13:43<29:10, 102.97s/it]
Total progress: 32%|████████▋ | 8/25 [13:43<29:04, 102.60s/it]
Additional information
No response