stablediffusion: Would it be possible to use another OpenCLIP arch?

Open Mateusmsouza opened this issue 1 year ago • 0 comments

I noticed that the OpenCLIP version used by default is ViT-H-14 laion2b_s32b_b79k. I tried another version (ViT-B-32 laion2b_s34b_b79k) and got errors when loading the model weights:

RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
        size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 512]).
        size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight: copying

Here is how I changed the config.yaml:

# configs/stable-diffusion/v2-inference-v.yaml
unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        ...
        context_dim: 512 # only change I made from 1024 to 512
        ...

cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
      params:
        freeze: True
        layer: "penultimate"
        # my changes below
        arch: "ViT-B-32" 
        version: "laion2b_s34b_b79k"

Is it possible to use another OpenCLIP version with the SD2 weights just by changing a few settings in the YAML config, or is that simply not doable?
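
For context, here is a minimal sketch of the shape check that seems to be failing. It assumes the cross-attention key/value projections (`attn2.to_k`, `attn2.to_v`) store weights as `[inner_dim, context_dim]`, which matches the `[320, 1024]` vs `[320, 512]` shapes in the traceback. The helper `find_context_dim_mismatches` and the `attn1` entry are hypothetical and only for illustration; they are not part of the stablediffusion code base.

```python
def find_context_dim_mismatches(checkpoint_shapes, context_dim):
    """List cross-attention K/V weights whose input width differs from context_dim.

    checkpoint_shapes maps parameter names to (out_features, in_features) tuples.
    Returns (name, checkpoint_shape, model_shape) for every mismatch.
    """
    cross_attn_suffixes = ("attn2.to_k.weight", "attn2.to_v.weight")
    mismatches = []
    for name, shape in checkpoint_shapes.items():
        # Only attn2 (cross-attention) consumes the text-encoder output,
        # so only its K/V projections depend on context_dim.
        if name.endswith(cross_attn_suffixes) and shape[1] != context_dim:
            mismatches.append((name, shape, (shape[0], context_dim)))
    return mismatches


checkpoint_shapes = {
    # Shape copied from the traceback above.
    "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight": (320, 1024),
    # Illustrative self-attention entry (assumed shape); it ignores context_dim.
    "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_k.weight": (320, 320),
}

# context_dim: 512, as in my modified config
for name, ckpt_shape, model_shape in find_context_dim_mismatches(checkpoint_shapes, 512):
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")

# context_dim: 1024, as in the original config, gives no mismatches
print(find_context_dim_mismatches(checkpoint_shapes, 1024))
```

If that assumption holds, the checkpoint's cross-attention layers were trained against 1024-wide text embeddings, so changing `context_dim` in the YAML makes the model shapes disagree with the saved weights rather than adapting them.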

Apr 06 '23 14:04 Mateusmsouza
