ComfyUI-LTXVideo

# CheckpointLoaderSimple Error(s) in loading state_dict for VideoVAE: size mismatch for encoder.down_blocks.6.res_blocks.0.conv1.conv.weight

[Open] ABIA2024 opened this issue 8 months ago · 0 comments

The example ITV (image-to-video) workflow works with the checkpoint LTX-video-...V1.9, but the exact same workflow fails with v0.9.5, giving a log that begins like this:

CheckpointLoaderSimple — Error(s) in loading state_dict for VideoVAE: size mismatch for encoder.down_blocks.6.res_blocks.0.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).

What can I do? The complete log:

```
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\nodes.py", line 553, in load_checkpoint
    out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 712, in load_checkpoint_guess_config
    out = load_state_dict_guess_config(sd, output_vae, output_clip, output_clipvision, embedding_directory, output_model, model_options, te_model_options=te_model_options)
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 758, in load_state_dict_guess_config
    vae = VAE(sd=vae_sd)
  File "C:\Comfyui\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 355, in __init__
    m, u = self.first_stage_model.load_state_dict(sd, strict=False)
  File "C:\Comfyui\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 2584, in load_state_dict
    raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for VideoVAE:
    size mismatch for encoder.down_blocks.6.res_blocks.0.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.6.res_blocks.0.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.6.res_blocks.0.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.6.res_blocks.0.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.6.res_blocks.1.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.6.res_blocks.1.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.6.res_blocks.1.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.6.res_blocks.1.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.8.res_blocks.0.conv1.conv.weight: copying a param with shape torch.Size([2048, 2048, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.8.res_blocks.0.conv1.conv.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.8.res_blocks.0.conv2.conv.weight: copying a param with shape torch.Size([2048, 2048, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.8.res_blocks.0.conv2.conv.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.8.res_blocks.1.conv1.conv.weight: copying a param with shape torch.Size([2048, 2048, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.8.res_blocks.1.conv1.conv.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.down_blocks.8.res_blocks.1.conv2.conv.weight: copying a param with shape torch.Size([2048, 2048, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for encoder.down_blocks.8.res_blocks.1.conv2.conv.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.conv_out.conv.weight: copying a param with shape torch.Size([129, 2048, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([129, 512, 3, 3, 3]).
    size mismatch for decoder.conv_in.conv.weight: copying a param with shape torch.Size([1024, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3, 3]).
    size mismatch for decoder.conv_in.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.0.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.0.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.0.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.0.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.1.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.1.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.1.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.1.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.2.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.2.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.2.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.2.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.3.conv1.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.3.conv1.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.0.res_blocks.3.conv2.conv.weight: copying a param with shape torch.Size([1024, 1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3, 3]).
    size mismatch for decoder.up_blocks.0.res_blocks.3.conv2.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder.up_blocks.5.conv.conv.weight: copying a param with shape torch.Size([1024, 256, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([2048, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.5.conv.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for decoder.up_blocks.6.res_blocks.0.conv1.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.0.conv1.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for decoder.up_blocks.6.res_blocks.0.conv2.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.0.conv2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for decoder.up_blocks.6.res_blocks.1.conv1.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.1.conv1.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for decoder.up_blocks.6.res_blocks.1.conv2.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.1.conv2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for decoder.up_blocks.6.res_blocks.2.conv1.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.2.conv1.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for decoder.up_blocks.6.res_blocks.2.conv2.conv.weight: copying a param with shape torch.Size([128, 128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3, 3]).
    size mismatch for decoder.up_blocks.6.res_blocks.2.conv2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).

2025-04-16T10:25:13.609168 - Prompt executed in 17.07 seconds
```
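The shapes in the traceback say the checkpoint's VAE encoder is wider (1024/2048 channels) than the VideoVAE ComfyUI instantiates for it (512 channels). A quick way to verify the downloaded file itself is intact is to read the tensor shapes straight from the safetensors header. This is a minimal diagnostic sketch, not part of ComfyUI; the path is a placeholder for your own install, and the exact key naming (e.g. a `vae.` prefix inside the combined checkpoint) is an assumption you may need to adjust:

```python
# Minimal diagnostic sketch (not part of ComfyUI). Assumptions: the checkpoint
# is a .safetensors file, CKPT is a placeholder path for your own install, and
# the VAE keys may carry a "vae." prefix -- adjust the substring filter if so.
from safetensors import safe_open

CKPT = r"C:\Comfyui\ComfyUI_windows_portable\ComfyUI\models\checkpoints\ltx-video-2b-v0.9.5.safetensors"

with safe_open(CKPT, framework="pt") as f:
    for key in sorted(f.keys()):
        if "down_blocks.6" in key or "conv_out" in key:
            # get_slice() reads only the file header, so we see each
            # tensor's shape without loading its data into memory
            print(key, list(f.get_slice(key).get_shape()))
```

If the printed encoder widths really are 1024/2048, the download is fine and the mismatch points at the VAE config being built for this file; in that case, updating ComfyUI and ComfyUI-LTXVideo to versions that know the v0.9.5 VAE layout would be the first thing to try.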

## Attached Workflow
Please make sure that workflow does not contain any sensitive information such as API keys or passwords.

{"last_node_id":96,"last_link_id":252,"nodes":[{"id":38,"type":"CLIPLoader","pos":[60,190],"size":[315,98],"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"CLIP","type":"CLIP","links":[74,75],"slot_index":0}],"properties":{"Node name for S&R":"CLIPLoader"},"widgets_values":["t5xxl_fp16.safetensors","ltxv"]},{"id":76,"type":"Note","pos":[40,350],"size":[360,200],"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[],"properties":{},"widgets_values":["This model needs long descriptive prompts, if the prompt is too short the quality will suffer greatly."],"color":"#432","bgcolor":"#653"},{"id":71,"type":"LTXVScheduler","pos":[880,290],"size":[315,154],"flags":{},"order":11,"mode":0,"inputs":[{"name":"latent","type":"LATENT","link":249,"shape":7}],"outputs":[{"name":"SIGMAS","type":"SIGMAS","links":[182],"slot_index":0}],"properties":{"Node name for S&R":"LTXVScheduler"},"widgets_values":[30,2.05,0.95,true,0.1]},{"id":73,"type":"KSamplerSelect","pos":[880,190],"size":[315,58],"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"SAMPLER","type":"SAMPLER","links":[172]}],"properties":{"Node name for S&R":"KSamplerSelect"},"widgets_values":["euler"]},{"id":8,"type":"VAEDecode","pos":[1740,30],"size":[210,46],"flags":{},"order":13,"mode":0,"inputs":[{"name":"samples","type":"LATENT","link":235},{"name":"vae","type":"VAE","link":87}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[106],"slot_index":0}],"properties":{"Node name for S&R":"VAEDecode"},"widgets_values":[]},{"id":41,"type":"SaveAnimatedWEBP","pos":[1970,30],"size":[493.98468017578125,481.28692626953125],"flags":{},"order":14,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":106}],"outputs":[],"properties":{"Node name for S&R":"SaveAnimatedWEBP"},"widgets_values":["ComfyUI",24,false,90,"default",null]},{"id":69,"type":"LTXVConditioning","pos":[920,60],"size":[223.8660125732422,78],"flags":{},"order":10,"mode":0,"inputs":[{"name":"positive","type":"CONDITIONING","link":245},{"name":"negative","type":"CONDITIONING","link":246}],"outputs":[{"name":"positive","type":"CONDITIONING","links":[199],"slot_index":0},{"name":"negative","type":"CONDITIONING","links":[167],"slot_index":1}],"properties":{"Node name for S&R":"LTXVConditioning"},"widgets_values":[25]},{"id":6,"type":"CLIPTextEncode","pos":[420,180],"size":[422.84503173828125,164.31304931640625],"flags":{},"order":6,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":74}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[239],"slot_index":0}],"title":"CLIP Text Encode (Positive Prompt)","properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["A red fox moving gracefully, its russet coat vibrant against the white landscape, leaving perfect star-shaped prints behind as steam rises from its breath in the crisp winter air. 
The scene is wrapped in snow-muffled silence, broken only by the gentle murmur of water still flowing beneath the ice."],"color":"#232","bgcolor":"#353"},{"id":7,"type":"CLIPTextEncode","pos":[420,390],"size":[425.27801513671875,180.6060791015625],"flags":{},"order":7,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":75}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[240],"slot_index":0}],"title":"CLIP Text Encode (Negative Prompt)","properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["low quality, worst quality, deformed, distorted, disfigured, motion smear, motion artifacts, fused fingers, bad anatomy, weird hand, ugly"],"color":"#322","bgcolor":"#533"},{"id":95,"type":"LTXVImgToVideo","pos":[888.8251342773438,608.7010498046875],"size":[315,214],"flags":{},"order":9,"mode":0,"inputs":[{"name":"positive","type":"CONDITIONING","link":239},{"name":"negative","type":"CONDITIONING","link":240},{"name":"vae","type":"VAE","link":250},{"name":"image","type":"IMAGE","link":252}],"outputs":[{"name":"positive","type":"CONDITIONING","links":[245],"slot_index":0},{"name":"negative","type":"CONDITIONING","links":[246],"slot_index":1},{"name":"latent","type":"LATENT","links":[247,249],"slot_index":2}],"properties":{"Node name for S&R":"LTXVImgToVideo"},"widgets_values":[768,512,97,1,0.15]},{"id":72,"type":"SamplerCustom","pos":[1201,32],"size":[355.20001220703125,230],"flags":{},"order":12,"mode":0,"inputs":[{"name":"model","type":"MODEL","link":181},{"name":"positive","type":"CONDITIONING","link":199},{"name":"negative","type":"CONDITIONING","link":167},{"name":"sampler","type":"SAMPLER","link":172},{"name":"sigmas","type":"SIGMAS","link":182},{"name":"latent_image","type":"LATENT","link":247}],"outputs":[{"name":"output","type":"LATENT","links":[235],"slot_index":0},{"name":"denoised_output","type":"LATENT","links":null}],"properties":{"Node name for S&R":"SamplerCustom"},"widgets_values":[true,338582498404775,"randomize",3]},{"id":96,"type":"ttN imageOutput","pos":[481.1054382324219,644.130859375],"size":[315,270],"flags":{},"order":8,"mode":0,"inputs":[{"name":"image","type":"IMAGE","link":251}],"outputs":[{"name":"image","type":"IMAGE","links":[252],"slot_index":0}],"properties":{"Node name for S&R":"ttN imageOutput"},"widgets_values":["Preview","C:\Comfyui\ComfyUI_windows_portable\ComfyUI\output","ComfyUI",5,"png",false,true]},{"id":82,"type":"LTXVPreprocess","pos":[570,670],"size":[275.9266662597656,58],"flags":{},"order":7,"mode":0,"inputs":[{"name":"image","type":"IMAGE","link":226}],"outputs":[{"name":"output_image","type":"IMAGE","links":[248],"slot_index":0}],"properties":{"Node name for S&R":"LTXVPreprocess"},"widgets_values":[40]},{"id":78,"type":"LoadImage","pos":[3.355452060699463,654.8484497070312],"size":[385.15606689453125,333.3305358886719],"flags":{},"order":4,"mode":0,"inputs":[],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[251],"slot_index":0},{"name":"MASK","type":"MASK","links":null}],"properties":{"Node name for S&R":"LoadImage"},"widgets_values":["fox.jpg","image"]},{"id":44,"type":"CheckpointLoaderSimple","pos":[520,30],"size":[315,98],"flags":{},"order":5,"mode":0,"inputs":[],"outputs":[{"name":"MODEL","type":"MODEL","links":[181],"slot_index":0},{"name":"CLIP","type":"CLIP","links":null},{"name":"VAE","type":"VAE","links":[87,250],"slot_index":2}],"properties":{"Node name for 
S&R":"CheckpointLoaderSimple"},"widgets_values":["ltx-video-2b-v0.9.5.safetensors"]}],"links":[[74,38,0,6,0,"CLIP"],[75,38,0,7,0,"CLIP"],[87,44,2,8,1,"VAE"],[106,8,0,41,0,"IMAGE"],[167,69,1,72,2,"CONDITIONING"],[172,73,0,72,3,"SAMPLER"],[181,44,0,72,0,"MODEL"],[182,71,0,72,4,"SIGMAS"],[199,69,0,72,1,"CONDITIONING"],[235,72,0,8,0,"LATENT"],[239,6,0,95,0,"CONDITIONING"],[240,7,0,95,1,"CONDITIONING"],[245,95,0,69,0,"CONDITIONING"],[246,95,1,69,1,"CONDITIONING"],[247,95,2,72,5,"LATENT"],[249,95,2,71,0,"LATENT"],[250,44,2,95,2,"VAE"],[251,78,0,96,0,"IMAGE"],[252,96,0,95,3,"IMAGE"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.5289271573991922,"offset":[-13.57877189135429,186.3052875245694]},"ue_links":[]},"version":0.4}


## Additional Context
(Please add any additional context or steps to reproduce the error here)

ABIA2024 · Apr 16 '25 08:04