stable-diffusion

Older diffusion models won't work with current txt2img script

Open SimonInParis opened this issue 2 years ago • 4 comments

Older diffusion models no longer work with the current txt2img script, for example models/ldm/text2img256/config.yaml with its corresponding checkpoint. With the text2img256 model checkpoint, the command line was:

python3 scripts/txt2img.py --ckpt="model.ckpt" --config="models/ldm/text2img256/config.yaml"

This leads to a tensor shape mismatch:

  File "/home/simon/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [192, 3, 3, 3], expected input[6, 4, 64, 64] to have 3 channels, but got 4 channels instead

Possibly an extra batch-size dimension?

SimonInParis avatar Aug 16 '22 18:08 SimonInParis
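
One way to read the error above, as a minimal sketch: PyTorch's conv2d weight is laid out as [out_channels, in_channels / groups, kH, kW] and its input as [batch, channels, H, W], so the reported mismatch sits in the channel dimension (index 1) rather than the batch dimension. The helper name below is hypothetical, and it works on plain Python lists instead of tensors so it can run without PyTorch.

```python
def check_conv2d_channels(weight_shape, input_shape, groups=1):
    """Return (expected, got) input channels for a conv2d call.

    weight_shape: [out_channels, in_channels // groups, kH, kW]
    input_shape:  [batch, channels, H, W]
    """
    expected = weight_shape[1] * groups  # channels the layer was built for
    got = input_shape[1]                 # channels actually passed in
    return expected, got


# Shapes copied from the traceback above.
expected, got = check_conv2d_channels([192, 3, 3, 3], [6, 4, 64, 64])
print(f"expected {expected} channels, got {got}")
```

Under this reading, the leading 6 in input[6, 4, 64, 64] is the batch size, and the 4 is the number of channels in the tensor the script passed in, while this convolution was built for 3.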

The same happens to me with the recent v1.4 model and this command:

python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
        size mismatch for first_stage_model.post_quant_conv.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([4]).

7flash avatar Sep 03 '22 07:09 7flash
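
A size mismatch like the one above usually means the checkpoint and the config describe different architectures. The following is a rough sketch for listing every parameter whose shape differs, assuming you have two name-to-shape mappings. With PyTorch these could be built as {k: tuple(v.shape) for k, v in state_dict.items()}. The function name and the second dictionary entry are hypothetical and only for illustration.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for parameters that
    exist in both mappings but have different shapes."""
    mismatches = []
    for name in sorted(checkpoint_shapes):
        if name not in model_shapes:
            continue  # missing keys are a separate problem; skip them here
        ckpt_shape = tuple(checkpoint_shapes[name])
        model_shape = tuple(model_shapes[name])
        if ckpt_shape != model_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# First entry taken from the error message above; the second is made up
# to show a parameter whose shapes agree.
checkpoint_shapes = {
    "first_stage_model.post_quant_conv.bias": (3,),
    "example.matching.weight": (8, 8),
}
model_shapes = {
    "first_stage_model.post_quant_conv.bias": (4,),
    "example.matching.weight": (8, 8),
}

for name, ckpt_shape, model_shape in shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If many parameters differ in the same way (for example 3 versus 4 in one dimension), that points at the config rather than at a corrupted checkpoint.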

image

It hangs at this point while loading the config.

7flash avatar Sep 03 '22 08:09 7flash

Using this config instead works for me:

https://github.com/basujindal/stable-diffusion/blob/main/optimizedSD/v1-inference.yaml

7flash avatar Sep 04 '22 09:09 7flash

Change the shape in txt2img.py to shape = [3, opt.H//8, opt.W//8]. However, the default weights seem to be poorly pretrained: they produce colored squares rather than images.

trofimovaolga avatar Mar 15 '24 12:03 trofimovaolga
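
The shape change suggested above can be sketched as plain arithmetic. This is a minimal illustration, not the script's actual code: the function name is hypothetical, and the 512x512 image size is an assumption chosen because it is consistent with the 64x64 spatial size in the original error.

```python
def latent_shape(height, width, channels, downsample=8):
    """Per-sample latent shape [C, H // f, W // f] for an image of the given size."""
    return [channels, height // downsample, width // downsample]


# 4 channels matches input[6, 4, 64, 64] from the original error (minus the batch dimension).
print(latent_shape(512, 512, channels=4))

# 3 channels is what the text2img256 convolution weight [192, 3, 3, 3] expects.
print(latent_shape(512, 512, channels=3))
```

Changing only the channel count leaves the spatial size unchanged, so whether the result looks reasonable still depends on the resolution the checkpoint was trained for.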