
[Bug]: Size mismatch for model.diffusion_model.output_block

Open rethink-studios opened this issue 2 years ago • 10 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

Already up to date.
venv "G:\StableDiffusion\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 6074175faa751dde933aa8e15cd687ca4e4b4a23
Installing requirements for Web UI
Launching Web UI with arguments: --xformers --disable-safe-unpickle --allow-code --autolaunch --theme dark --deepdanbooru
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights [09dd2ae4] from G:\StableDiffusion\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.ckpt
Global Step: 875000
Traceback (most recent call last):
  File "G:\StableDiffusion\stable-diffusion-webui\launch.py", line 255, in <module>
    start()
  File "G:\StableDiffusion\stable-diffusion-webui\launch.py", line 250, in start
    webui.webui()
  File "G:\StableDiffusion\stable-diffusion-webui\webui.py", line 152, in webui
    initialize()
  File "G:\StableDiffusion\stable-diffusion-webui\webui.py", line 86, in initialize
    modules.sd_models.load_model()
  File "G:\StableDiffusion\stable-diffusion-webui\modules\sd_models.py", line 257, in load_model
    load_model_weights(sd_model, checkpoint_info)
  File "G:\StableDiffusion\stable-diffusion-webui\modules\sd_models.py", line 188, in load_model_weights
    model.load_state_dict(sd, strict=False)
  File "G:\StableDiffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
    size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.input_blocks.1.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.2.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.input_blocks.2.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.input_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.input_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.input_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.input_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.middle_block.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.middle_block.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.3.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.3.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
    size mismatch for model.diffusion_model.output_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.6.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.6.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for model.diffusion_model.output_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.9.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.9.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.10.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.10.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.11.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
    size mismatch for model.diffusion_model.output_blocks.11.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
Press any key to continue . . .
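What `load_state_dict` is complaining about is a key-by-key shape comparison: the SD2 checkpoint stores cross-attention key/value weights sized for OpenCLIP (`context_dim: 1024`), while a model built from a v1 config expects CLIP ViT-L sizes (768). A minimal sketch of the comparison, using a hypothetical `check_shapes` helper (not the webui's actual code):

```python
def check_shapes(checkpoint_shapes, model_shapes):
    """Return the keys whose tensor shapes disagree -- the entries
    load_state_dict reports as 'size mismatch'."""
    return [k for k in checkpoint_shapes
            if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]]

# Shapes taken from the error above: a v2 checkpoint against a v1-config model.
ckpt = {"attn2.to_k.weight": (320, 1024), "proj_in.weight": (320, 320)}
model = {"attn2.to_k.weight": (320, 768), "proj_in.weight": (320, 320, 1, 1)}
print(check_shapes(ckpt, model))  # -> ['attn2.to_k.weight', 'proj_in.weight']
```

Both mismatches point the same way: the checkpoint is a v2 model, but the config being applied is a v1 one, which is why supplying the matching v2 yaml fixes it.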

Steps to reproduce the problem

  1. Installed a fresh copy of AUTOMATIC1111 via git pull
  2. Copied the appropriate models and .yaml files to the correct directories
  3. Ran webui-user.bat and received the error above

What should have happened?

WebUI would automatically open and I could create an image

Commit where the problem happens

6074175faa751dde933aa8e15cd687ca4e4b4a23

What platforms do you use to access UI ?

Windows

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

@echo off

set PYTHON=C:\Python310\python.exe
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --disable-safe-unpickle --allow-code --autolaunch --theme dark --deepdanbooru --vae-path "models\Stable-diffusion\vae-ft-mse-840000-ema-pruned.pt"
set CUDA_VISIBLE_DEVICES=1

git pull

call webui.bat

Additional information, context and logs

No response

rethink-studios avatar Nov 27 '22 12:11 rethink-studios

I had the same problem; I had mis-copied the yaml file. In the SD models directory, does 768-v-ema.yaml show as a text file or as a yaml file?

timbgray avatar Nov 27 '22 17:11 timbgray

It shows as a yaml file, and it is copied inside stable-diffusion-webui\models\Stable-diffusion.


rethink-studios avatar Nov 27 '22 17:11 rethink-studios

What did you do to fix?

rethink-studios avatar Nov 27 '22 17:11 rethink-studios

I just re-saved the git raw file as 768-v-ema and got 768-v-ema.yaml rather than 768-v-ema.yaml.txt.

timbgray avatar Nov 27 '22 21:11 timbgray

Same mistake here:

RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
    size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
    size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).

kakaxixx avatar Nov 28 '22 06:11 kakaxixx

Where do you even get that 768-v-ema.yaml file? I cannot find it in the docs, dependencies or google. Does it get created somehow? Feeling really stupid here.

thorbenkohler avatar Nov 28 '22 06:11 thorbenkohler

Where do you even get that 768-v-ema.yaml file? I cannot find it in the docs, dependencies or google. Does it get created somehow? Feeling really stupid here.

https://github.com/Stability-AI/stablediffusion/tree/main/configs/stable-diffusion

still not working for me tho :(

Worrah avatar Nov 28 '22 11:11 Worrah

Here is a YouTube video that I followed.
https://youtu.be/Zx2zhbZLj9c

timbgray avatar Nov 28 '22 11:11 timbgray

Well, somehow it began to work after I ran a random model a couple of times with the pixel script. After switching to 768-v-ema, instead of the usual error it started to download 4 GB. I don't know what changed.

Worrah avatar Nov 28 '22 14:11 Worrah

Fixed on my end.

The error was that SD loads the first ckpt in the list (the 512 one), and I didn't have a yaml file for 512.ckpt. Since I'm not using it, I moved it and the other ckpts without matching yaml files into a subdirectory inside the models directory. Now 768.ckpt is the first one seen, and it loads cleanly.


LMK if this helps!

rethink-studios avatar Nov 29 '22 16:11 rethink-studios
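The fallback rethink-studios describes can be sketched as follows. This is an assumption for illustration (the webui's real selection logic is more involved); `default_checkpoint` is a hypothetical helper:

```python
from pathlib import Path

def default_checkpoint(model_dir):
    """Rough sketch (assumption): with no checkpoint explicitly
    selected, fall back to the first model found in sorted order."""
    ckpts = sorted(Path(model_dir).glob("*.ckpt"))
    return ckpts[0] if ckpts else None

# "512-base-ema.ckpt" sorts before "768-v-ema.ckpt", so the 512 model
# (with its missing yaml) is the one that gets loaded at startup.
```

Moving the yaml-less checkpoints out of the directory changes which file comes first, which is why the fix above works.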

The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml

gtx155 avatar Nov 29 '22 18:11 gtx155
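The matching rule gtx155 describes can be shown with a small sketch; `find_config` is a hypothetical helper that loosely mimics the lookup (same stem, `.yaml` extension, next to the checkpoint) rather than quoting the webui's code:

```python
from pathlib import Path

def find_config(checkpoint_path):
    """Hypothetical sketch of the per-model config lookup: a .yaml
    file with the same stem, sitting next to the checkpoint."""
    candidate = Path(checkpoint_path).with_suffix(".yaml")
    return candidate if candidate.exists() else None
```

So v2-768-v-ema.ckpt is paired with v2-768-v-ema.yaml; any other config filename is simply not seen.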

FYI, I had the same errors, but it turned out I was trying to use the embedding I created for SD1.4, at 512x512, and when it tried to load in SD2.0 at 768x768, it threw the size mismatch error. I updated the embedding to SD2.0, now all loads and works properly.

mickhogan avatar Dec 01 '22 15:12 mickhogan

Hello @mickhogan, how do you update the embedding to SD 2.0? I have the same errors.

brian-tam avatar Dec 01 '22 22:12 brian-tam

@brian-tam I had to retrain the embedding completely under 2.0. There is no converter, that I know of, to update 1.4 embedding to 2.0 embedding.

mickhogan avatar Dec 04 '22 06:12 mickhogan

I found a method to run Stable Diffusion v2 without issues (thanks to the Dot CSV notebook):

  • Clone the repo (if you have it, clone again to make a clean install):

    git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
    cd stable-diffusion-webui
    
  • Download the model and its config (note -L, since both hosts redirect):

    curl -L https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt -o ./models/Stable-diffusion/768-v-ema.ckpt
    curl -L https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -o ./models/Stable-diffusion/768-v-ema.yaml
    
  • Execute the program:

    # On Windows
    .\webui-user.bat
    
    # On Linux
    ./webui.sh
    

sleep-written avatar Dec 04 '22 21:12 sleep-written

The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml

Bless you. This solved it for me right away.

mindfu23 avatar Dec 17 '22 07:12 mindfu23

You can copy and paste it from raw just fine. For me the problem was the yaml file extension: Notepad++ was saving it as .yml instead of .yaml.

Solution: rename the config file to match your model's name AND change the extension to .yaml instead of the automatic (for me) .yml.

You can do this with a simple file rename in Windows directly from your folder; Windows will prompt you to confirm that you really want to change the extension, say yes.

jmsassuncao avatar Dec 17 '22 18:12 jmsassuncao

The names of the model and the yaml config should match. So if the name of the model is v2-768-v-ema.ckpt, the yaml file should be v2-768-v-ema.yaml

I was having the same problem, and this fixed it. The newer downloads of the .yaml file changed the file name. Once you change it to match the ckpt file (keeping the .yaml extension), it will load.

lekjaz avatar Dec 20 '22 08:12 lekjaz

My solution was to run as administrator

FortuProject avatar Dec 28 '22 04:12 FortuProject

Ran into the same issue. I also thought I had the yaml file saved with the proper extension. However, when I looked at Properties, I saw the file was actually xxx.yaml.txt; File Explorer simply did not show the second extension. To change this, go to the "View" menu in the File Explorer window and make sure "File name extensions" is checked. This reveals the second extension and lets you fix the file type properly by deleting the .txt part of the filename. Good luck!


ftw-tech avatar Dec 28 '22 20:12 ftw-tech
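The hidden-extension trap ftw-tech describes can be reproduced in a few lines (the filenames here are hypothetical):

```python
import tempfile
from pathlib import Path

# Windows Explorer hides known extensions, so a config saved from
# Notepad can really be "768-v-ema.yaml.txt" while displaying as
# "768-v-ema.yaml".
with tempfile.TemporaryDirectory() as d:
    bad = Path(d) / "768-v-ema.yaml.txt"
    bad.write_text("model: ...")
    wanted = Path(d) / "768-v-ema.yaml"
    print(wanted.exists())  # False: the webui cannot find the config
    bad.rename(wanted)      # the fix: strip the trailing ".txt"
    print(wanted.exists())  # True
```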

EDIT: Oh my God... beware the Yaml Yaml!!!


Okay, please help. I'm going slightly crazy here. None of the above suggestions have fixed this (it's been days... sob):

  • used --ckpt to force the right ckpt to load on startup
  • reinstalled AUTOMATIC1111 from scratch
  • SD 1.5 works fine
  • the yaml file has the same name as the ckpt
  • the yaml is actually a yaml file (not a text file)
  • ran cmd in administrator mode


contents of my webui-user.bat:

@echo off

set PYTHON="C:\Users\username\AppData\Local\Programs\Python\Python310\python.exe"
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS= --ckpt "C:\Users\username\stable-diffusion-webui\models\Stable-diffusion\768-v-ema.ckpt"

git pull

call webui.bat

contents of the yaml file

model:
  base_learning_rate: 1.0e-4
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    parameterization: "v"
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False # we set this to false because this is an inference only config

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        use_checkpoint: True
        use_fp16: True
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_head_channels: 64 # need to fix for flash-attn
        use_spatial_transformer: True
        use_linear_in_transformer: True
        transformer_depth: 1
        context_dim: 1024
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          #attn_type: "vanilla-xformers"
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
      params:
        freeze: True
        layer: "penultimate"

muunkky avatar Dec 29 '22 04:12 muunkky

One other thing worth noting here is that the filename extension for the config file needs to be .yaml and not .yml

designerjason avatar Dec 30 '22 12:12 designerjason

> (muunkky's comment above, quoted in full)

I am at exactly the same point. I tried everything and have been trying since the morning.

File "C:\Users\PC\Desktop\stable-diffusion-webui-1.7.0\modules\sd_disable_initialization.py", line 221, in load_state_dict
    original(module, state_dict, strict=strict)
  File "C:\Users\PC\Desktop\stable-diffusion-webui-1.7.0\venv\Lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
    size mismatch for first_stage_model.encoder.mid.attn_1.q.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.k.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.v.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.encoder.mid.attn_1.proj_out.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.q.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.k.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.v.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for first_stage_model.decoder.mid.attn_1.proj_out.bias: copying a param with shape torch.Size([512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).

What is the solution for this? None of these suggestions solve it. (Note that my mismatches are in first_stage_model, the VAE, not in the diffusion model's cross-attention, so this may be a different problem than the v1/v2 config mix-up above.)

toprakfirat avatar Dec 20 '23 19:12 toprakfirat