stable-diffusion-webui-docker
CPU mode loading float16 format, needs float32
Unless I've made a mistake somewhere, it appears the CPU version is attempting to load the model in half precision (float16), which CPU PyTorch cannot run. I was, however, able to solve the problem and get this container up and running, so I'm posting the fix here for anybody with a similar problem.
My `docker-compose.yaml`:

```yaml
version: '3.8'
services:
  stable-diffusion:
    image: docker.io/siutin/stable-diffusion-webui-docker:latest-cpu
    volumes:
      - ./models:/app/stable-diffusion-webui/models
      - ./outputs:/app/stable-diffusion-webui/outputs
    ports:
      - 7860:7860
    container_name: stable-diffusion
    network_mode: host
    restart: unless-stopped
    command: "bash webui.sh --skip-torch-cuda-test --use-cpu all --share"
    extra_hosts:
      - "127.0.0.1:0.0.0.0"
```
My log:
podman start -a stable-diffusion
[stable-diffusion] |
[stable-diffusion] | ################################################################
[stable-diffusion] | Install script for stable-diffusion + Web UI
[stable-diffusion] | Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
[stable-diffusion] | ################################################################
[stable-diffusion] |
[stable-diffusion] | ################################################################
[stable-diffusion] | Running on app user
[stable-diffusion] | ################################################################
[stable-diffusion] |
[stable-diffusion] | ################################################################
[stable-diffusion] | Repo already cloned, using it as install directory
[stable-diffusion] | ################################################################
[stable-diffusion] |
[stable-diffusion] | ################################################################
[stable-diffusion] | Create and activate python venv
[stable-diffusion] | ################################################################
[stable-diffusion] |
[stable-diffusion] | ################################################################
[stable-diffusion] | Launching launch.py...
[stable-diffusion] | ################################################################
[stable-diffusion] | Using TCMalloc: libtcmalloc_minimal.so.4
[stable-diffusion] | Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
[stable-diffusion] | Version: v1.7.0
[stable-diffusion] | Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
[stable-diffusion] | Launching Web UI with arguments: --skip-torch-cuda-test --use-cpu all --share
[stable-diffusion] | no module 'xformers'. Processing without...
[stable-diffusion] | no module 'xformers'. Processing without...
[stable-diffusion] | No module 'xformers'. Proceeding without it.
[stable-diffusion] | Style database not found: /app/stable-diffusion-webui/styles.csv
[stable-diffusion] | Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
[stable-diffusion] | Downloading: "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" to /app/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
[stable-diffusion] |
100%|██████████| 3.97G/3.97G [07:06<00:00, 9.99MB/s]
[stable-diffusion] | Calculating sha256 for /app/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors: Running on local URL: http://127.0.0.1:7860
[stable-diffusion] | Running on public URL: https://dd6fa6e634c91b8160.gradio.live
[stable-diffusion] |
[stable-diffusion] | This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
[stable-diffusion] | Startup time: 439.6s (import torch: 2.6s, import gradio: 0.9s, setup paths: 0.8s, other imports: 0.5s, list SD models: 427.1s, load scripts: 0.7s, create ui: 0.6s, gradio launch: 6.0s).
[stable-diffusion] | 6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa
[stable-diffusion] | Loading weights [6ce0161689] from /app/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
[stable-diffusion] | Creating model from config: /app/stable-diffusion-webui/configs/v1-inference.yaml
vocab.json: 100%|██████████| 961k/961k [00:00<00:00, 4.72MB/s]
merges.txt: 100%|██████████| 525k/525k [00:00<00:00, 7.50MB/s]
special_tokens_map.json: 100%|██████████| 389/389 [00:00<00:00, 1.99MB/s]
tokenizer_config.json: 100%|██████████| 905/905 [00:00<00:00, 5.89MB/s]
config.json: 100%|██████████| 4.52k/4.52k [00:00<00:00, 21.9MB/s]
[stable-diffusion] | Applying attention optimization: InvokeAI... done.
loading stable diffusion model: RuntimeError
Traceback (most recent call last):
File "/app/miniconda3/lib/python3.10/threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "/app/miniconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/app/miniconda3/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/app/stable-diffusion-webui/modules/initialize.py", line 147, in load_model
shared.sd_model # noqa: B018
File "/app/stable-diffusion-webui/modules/shared_items.py", line 128, in sd_model
return modules.sd_models.model_data.get_sd_model()
File "/app/stable-diffusion-webui/modules/sd_models.py", line 531, in get_sd_model
load_model()
File "/app/stable-diffusion-webui/modules/sd_models.py", line 681, in load_model
sd_model.cond_stage_model_empty_prompt = get_empty_cond(sd_model)
File "/app/stable-diffusion-webui/modules/sd_models.py", line 569, in get_empty_cond
return sd_model.cond_stage_model([""])
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/modules/sd_hijack_clip.py", line 234, in forward
z = self.process_tokens(tokens, multipliers)
File "/app/stable-diffusion-webui/modules/sd_hijack_clip.py", line 273, in process_tokens
z = self.encode_with_transformers(tokens)
File "/app/stable-diffusion-webui/modules/sd_hijack_clip.py", line 326, in encode_with_transformers
outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=-opts.CLIP_stop_at_last_layers)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 822, in forward
return self.text_model(
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 740, in forward
encoder_outputs = self.encoder(
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 654, in forward
layer_outputs = encoder_layer(
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 382, in forward
hidden_states = self.layer_norm1(hidden_states)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 531, in network_LayerNorm_forward
return originals.LayerNorm_forward(self, input)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
return F.layer_norm(
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
Stable diffusion model failed to load
Exception in thread Thread-3 (load_model):
Traceback (most recent call last):
File "/app/miniconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/app/miniconda3/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/app/stable-diffusion-webui/modules/initialize.py", line 153, in load_model
devices.first_time_calculation()
File "/app/stable-diffusion-webui/modules/devices.py", line 162, in first_time_calculation
linear(x)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/app/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 486, in network_Linear_forward
return originals.Linear_forward(self, input)
File "/app/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
These errors:

```
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
```

stand out to me as indicating that the model is being loaded in float16. I could very well be wrong, of course; I am only basing this on this StackOverflow discussion: https://stackoverflow.com/questions/75641074/i-run-stable-diffusion-its-wrong-runtimeerror-layernormkernelimpl-not-implem. That same discussion mentions that the model can be forced to load in float32 with the `--no-half` option. If I add that to my `command:` line,

```yaml
command: "bash webui.sh --skip-torch-cuda-test --use-cpu all --no-half --share"
```
then the log gets past the failure and ends with:

```
[stable-diffusion] | Applying attention optimization: InvokeAI... done.
[stable-diffusion] | Model loaded in 12.5s (calculate hash: 10.1s, load weights from disk: 0.1s, create model: 1.2s, apply weights to model: 0.9s, calculate empty prompt: 0.1s).
```
After which point I am able to load into the web UI and successfully generate images.
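To illustrate why `--no-half` helps, here is a minimal sketch (assuming PyTorch is installed) of the underlying issue: the web UI loads the checkpoint in float16 by default, but CPU builds of PyTorch have historically lacked float16 kernels for operations such as `layer_norm` and `addmm`, which is exactly what the two `not implemented for 'Half'` errors in the traceback report. Casting everything to float32, as `--no-half` does model-wide, avoids the missing kernels:

```python
import torch

# float16 parameters and input, mimicking the default (GPU-oriented) load path
layer = torch.nn.LayerNorm(4).half()
x = torch.randn(2, 4, dtype=torch.float16)

try:
    # On CPU-only PyTorch builds without float16 kernels this raises:
    # RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
    out = layer(x)
except RuntimeError as err:
    print(f"float16 failed on CPU: {err}")
    # Cast model and input to float32 -- the effect of --no-half
    out = layer.float()(x.float())

print(out.dtype)
```

Whether the `try` branch fails depends on the PyTorch version (newer releases have added some CPU float16 kernels), but the float32 path works everywhere, which is why forcing full precision is the reliable fix for CPU inference.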
Thank you very much. My problem was the same as yours, and adding `--no-half` fixed it for me too.