
Pretrained encoder checkpoint mismatches model

ZipperDeng opened this issue 9 months ago • 2 comments

Hi, thank you very much for this gorgeous project! I am testing the performance of the encoder task, but I hit a problem when loading the pretrained encoder (MoDi_encoder_f7c850_079999.pt) following the README. The error is below:


        size mismatch for convs.3.convs.1.1.mask: copying a param with shape torch.Size([128, 64, 4, 10, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 5, 10, 3]).
        size mismatch for convs.3.convs.1.1.scale: copying a param with shape torch.Size([1, 1, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 5, 1, 1]).
        size mismatch for convs.3.skip.1.weight: copying a param with shape torch.Size([128, 64, 4, 10, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 5, 10, 1]).
        size mismatch for convs.3.skip.1.mask: copying a param with shape torch.Size([128, 64, 4, 10, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 5, 10, 1]).
        size mismatch for convs.3.skip.1.scale: copying a param with shape torch.Size([1, 1, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 5, 1, 1]).
        size mismatch for convs.4.convs.0.0.weight: copying a param with shape torch.Size([128, 128, 4, 4, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 5, 5, 3]).
        size mismatch for convs.4.convs.0.0.mask: copying a param with shape torch.Size([128, 128, 4, 4, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 5, 5, 3]).
        size mismatch for convs.4.convs.0.0.scale: copying a param with shape torch.Size([1, 1, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 5, 1, 1]).
        size mismatch for convs.4.convs.1.1.weight: copying a param with shape torch.Size([256, 128, 2, 4, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 4, 5, 3]).
        size mismatch for convs.4.convs.1.1.mask: copying a param with shape torch.Size([256, 128, 2, 4, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 4, 5, 3]).
        size mismatch for convs.4.convs.1.1.scale: copying a param with shape torch.Size([1, 1, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 1, 1]).
        size mismatch for convs.4.skip.1.weight: copying a param with shape torch.Size([256, 128, 2, 4, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 4, 5, 1]).
        size mismatch for convs.4.skip.1.mask: copying a param with shape torch.Size([256, 128, 2, 4, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 4, 5, 1]).
        size mismatch for convs.4.skip.1.scale: copying a param with shape torch.Size([1, 1, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 1, 1]).
        size mismatch for final_conv.0.weight: copying a param with shape torch.Size([256, 257, 2, 2, 3]) from checkpoint, the shape in current model is torch.Size([256, 257, 4, 4, 3]).
        size mismatch for final_conv.0.mask: copying a param with shape torch.Size([256, 257, 2, 2, 3]) from checkpoint, the shape in current model is torch.Size([256, 257, 4, 4, 3]).
        size mismatch for final_conv.0.scale: copying a param with shape torch.Size([1, 1, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 1, 1]).
        size mismatch for final_linear.0.weight: copying a param with shape torch.Size([256, 2048]) from checkpoint, the shape in current model is torch.Size([256, 4096]).
        size mismatch for latent_predictor.linear1.weight: copying a param with shape torch.Size([7168, 4096]) from checkpoint, the shape in current model is torch.Size([7168, 5120]).

ZipperDeng avatar May 09 '24 08:05 ZipperDeng
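For anyone debugging a similar error, one way to locate every mismatched parameter without attempting a full load is to compare shapes between the checkpoint's state dict and the model's. A minimal sketch follows; the toy dicts stand in for the real `torch.load(...)` and `model.state_dict()` results, and only the `.shape` attribute is used, so plain stand-in objects suffice here:

```python
from types import SimpleNamespace

def find_shape_mismatches(model_state, ckpt_state):
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both state dicts whose shapes disagree."""
    mismatches = []
    for name, tensor in ckpt_state.items():
        if name in model_state and model_state[name].shape != tensor.shape:
            mismatches.append((name, tensor.shape, model_state[name].shape))
    return mismatches

# Toy stand-ins for real tensors; in practice these would come from
# torch.load(checkpoint_path) and model.state_dict().
ckpt = {"convs.3.skip.1.weight": SimpleNamespace(shape=(128, 64, 4, 10, 1))}
model = {"convs.3.skip.1.weight": SimpleNamespace(shape=(128, 64, 5, 10, 1))}

for name, ckpt_shape, model_shape in find_shape_mismatches(model, ckpt):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Shape mismatches like the ones in this log usually mean the checkpoint was trained under a different architecture configuration (e.g. different pooling or kernel-size hyperparameters) than the one the current code builds, which matches the eventual resolution of this issue.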

Dear @ZipperDeng, Thank you for your interest in our project. I will be happy to help by the end of this month. Until then I am occupied with previous commitments. Can you bear that long?

sigal-raab avatar May 10 '24 13:05 sigal-raab

> Dear @ZipperDeng, Thank you for your interest in our project. I will be happy to help by the end of this month. Until then I am occupied with previous commitments. Can you bear that long?

@sigal-raab OK, thanks, I am looking forward to your reply when you are not busy.

ZipperDeng avatar May 10 '24 13:05 ZipperDeng

@ZipperDeng, apologies for the delay, this will take a bit longer, but we are on it!

sigal-raab avatar Jun 02 '24 08:06 sigal-raab

@sigal-raab,

Thanks for keeping my issue in mind.

I found that the encoder example doesn't work with the pretrained encoder model linked in the newest README (https://drive.google.com/file/d/1AoyS3DCuqPNhlQfo7LkANa1TVqCLSM03/view); the error message is above.

However, it works with an encoder model I trained myself using train_encoder.py.

So I think there might be something wrong with the pretrained encoder model.

ZipperDeng avatar Jun 03 '24 08:06 ZipperDeng

I believe you are right. I hope to replace the trained encoder with a valid one as soon as possible.

sigal-raab avatar Jun 03 '24 09:06 sigal-raab

Thanks a lot, much appreciated!

ZipperDeng avatar Jun 03 '24 10:06 ZipperDeng

We pushed a fix for this problem. If you still see any problems, please re-open this issue.

sigal-raab avatar Jul 23 '24 11:07 sigal-raab