stable-diffusion-webui
[Bug]: Size mismatch for model.diffusion_model.output_block
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
Already up to date.
venv "G:\StableDiffusion\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 6074175faa751dde933aa8e15cd687ca4e4b4a23
Installing requirements for Web UI
Launching Web UI with arguments: --xformers --disable-safe-unpickle --allow-code --autolaunch --theme dark --deepdanbooru
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights [09dd2ae4] from G:\StableDiffusion\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.ckpt
Global Step: 875000
Traceback (most recent call last):
File "G:\StableDiffusion\stable-diffusion-webui\launch.py", line 255, in
Steps to reproduce the problem
- Installed a fresh version of AUTO1111 via git pull
- Copied the appropriate models and .yaml files to the correct directories
- Ran webui-user.bat and received this error:
What should have happened?
WebUI would automatically open and I could create an image
Commit where the problem happens
6074175faa751dde933aa8e15cd687ca4e4b4a23
What platforms do you use to access UI ?
Windows
What browsers do you use to access the UI ?
Google Chrome
Command Line Arguments
@echo off
set PYTHON=C:\Python310\python.exe
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --disable-safe-unpickle --allow-code --autolaunch --theme dark --deepdanbooru --vae-path "models\Stable-diffusion\vae-ft-mse-840000-ema-pruned.pt"
set CUDA_VISIBLE_DEVICES=1
git pull
call webui.bat
Additional information, context and logs
No response
I had the same problem; I had mis-copied the yaml file. In the SD models directory, does 768-v-ema.yaml show as a text file or a yaml file?
It shows as yaml, and it is copied inside stable-diffusion-webui\models\Stable-diffusion
What did you do to fix it?
I just re-saved the git raw file as 768-v-ema and got 768-v-ema.yaml rather than 768-v-ema.yaml.txt
Same mistake
RuntimeError: Error(s) in loading state_dict for LatentDiffusion: size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]). size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
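For context, the error above is the load-time shape check: every tensor in the checkpoint must have exactly the shape the model built from the yaml config expects. The shapes in the log are the giveaway: SD2 checkpoints use OpenCLIP conditioning with a context width of 1024, while an SD1-style config builds a model expecting 768. Here is an illustrative sketch of that check (the helper function is hypothetical, not webui code; the parameter name and shapes are taken from the log):

```python
# Illustrative sketch of the shape comparison behind load_state_dict errors:
# report every parameter where the checkpoint's shape differs from the shape
# the model (built from the yaml config) expects.
def find_size_mismatches(ckpt_shapes, model_shapes):
    """Map parameter name -> (checkpoint shape, model shape) where they differ."""
    return {
        name: (ckpt_shapes[name], shape)
        for name, shape in model_shapes.items()
        if name in ckpt_shapes and ckpt_shapes[name] != shape
    }

# SD2 checkpoint (context_dim 1024) vs. a model built from an SD1 config (768):
ckpt = {"input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight": (320, 1024)}
model = {"input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight": (320, 768)}
# find_size_mismatches(ckpt, model) reports the one mismatched parameter,
# which is exactly why the matching v2 yaml fixes the error.
```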
Where do you even get that 768-v-ema.yaml file? I cannot find it in the docs, dependencies or google. Does it get created somehow? Feeling really stupid here.
https://github.com/Stability-AI/stablediffusion/tree/main/configs/stable-diffusion
still not working for me tho :(
Here is a YouTube video that I followed.
https://youtu.be/Zx2zhbZLj9c
Well, somehow it began to work after I ran a random model with the pixel script a couple of times. After switching to 768-v-ema, instead of the usual error it started to download a 4 GB file. No idea what changed.
Fixed on my end.
The error was that SD loads the first ckpt in the list (512), and I didn't have a yaml file for the 512 ckpt. Since I'm not using 512, I moved it and the other ckpts without proper yaml files into a subdirectory inside the MODELS directory. Now 768.ckpt is the first one seen, and it loads cleanly.
LMK if this helps!
The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml
FYI, I had the same errors, but it turned out I was trying to use the embedding I created for SD1.4, at 512x512, and when it tried to load in SD2.0 at 768x768, it threw the size mismatch error. I updated the embedding to SD2.0, now all loads and works properly.
Hello mick hogan, how do you update the embedding to SD 2.0? I have the same errors.
@brian-tam I had to retrain the embedding completely under 2.0. There is no converter, that I know of, to update 1.4 embedding to 2.0 embedding.
I found a method to run Stable Diffusion v2 without issues (thanks to the Dot CSV notebook):

1. Clone the repo (if you already have it, clone again for a clean install):

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

2. Download the model and its config (note curl's -L so the Hugging Face redirect is followed):

curl -L https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt -o ./models/Stable-diffusion/768-v-ema.ckpt
curl -L https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -o ./models/Stable-diffusion/768-v-ema.yaml

3. Execute the program:

# On Windows
.\webui-user.bat
# On Linux
./webui.sh
The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml
Bless you. This solved it for me right away.
You can copy and paste it from raw just fine. For me the problem was the yaml file extension, as Notepad++ was saving it as .yml instead of .yaml.
Solution: rename the config file to match your model, AND make sure the extension ends in .yaml instead of the automatic (for me) .yml.
You can do this with a simple file rename in Windows directly from your folder; Windows will prompt you to confirm that you really want to change the extension, say yes.
The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml
I was having the same problem, and this fixed it. The newer downloads of the .yaml file changed the file name. Once you change it to match the ckpt file (keeping the .yaml extension), it will load.
My solution was to run as administrator
Ran into the same issue. I also thought I had the yaml file saved with the proper extension. However, when I looked at "Properties" I saw the file was xxx.yaml.txt, and File Explorer simply did not show the second extension. To change this, go to the "View" menu in the File Explorer window and make sure "File name extensions" is checked. This shows the hidden second extension and lets you change the file type properly by deleting the .txt part of the filename. Good luck!
EDIT: Oh my God... beware the Yaml Yaml!!!
okay, please help. I'm going slightly crazy here. none of the above suggestions have fixed this (it's been days... sob)
- using --ckpt to force the right ckpt to load on startup
- reinstalled automatic1111 from scratch
- sd 1.5 works fine.
- Yaml file has the same name as the ckpt
- yaml is actually a yaml file (not a text file)
- running cmd in administrator mode
contents of my webui-user.bat:
@echo off
set PYTHON="C:\Users\username\AppData\Local\Programs\Python\Python310\python.exe"
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --ckpt "C:\Users\username\stable-diffusion-webui\models\Stable-diffusion\768-v-ema.ckpt"
git pull
call webui.bat
contents of the yaml file
model:
  base_learning_rate: 1.0e-4
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    parameterization: "v"
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False # we set this to false because this is an inference only config

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        use_checkpoint: True
        use_fp16: True
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_head_channels: 64 # need to fix for flash-attn
        use_spatial_transformer: True
        use_linear_in_transformer: True
        transformer_depth: 1
        context_dim: 1024
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          #attn_type: "vanilla-xformers"
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
      params:
        freeze: True
        layer: "penultimate"
One other thing worth noting here is that the filename extension for the config file needs to be .yaml and not .yml
I am at exactly the same point. I have tried everything and have been at it since this morning.
File "C:\Users\PC\Desktop\stable-diffusion-webui-1.7.0\modules\sd_disable_initialization.py", line 221, in load_state_dict
    original(module, state_dict, strict=strict)
File "C:\Users\PC\Desktop\stable-diffusion-webui-1.7.0\venv\Lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
    size mismatch for first_stage_model.encoder.mid.attn_1.q.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.k.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.v.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.proj_out.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.q.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.k.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.v.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.proj_out.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
What is the solution for this? None of these suggestions solves it.