
Weights not matched

Open gshuangchun opened this issue 2 years ago • 1 comments

The link to the low-res weights for the PyTorch CholecT45 (crossval k1) model does not seem to match the model (https://s3.unistra.fr/camma_public/github/rendezvous/rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth):

size mismatch for decoder.mhma.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.mhma.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.ffnet.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
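A log like this is easier to diagnose once it is summarized. The following is a minimal sketch, not part of the rendezvous code, that extracts the parameter names and shapes from the message text with a regular expression. The helper names `parse_shape` and `parse_mismatches` are hypothetical, and `LOG` holds only two entries copied from the message for brevity.

```python
import re
from collections import Counter

# Two entries copied from the error message; paste the full text in practice.
LOG = (
    "size mismatch for decoder.mhma.0.ln.weight: copying a param with shape "
    "torch.Size([100, 8, 14]) from checkpoint, the shape in current model is "
    "torch.Size([100]). "
    "size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape "
    "torch.Size([100, 8, 14]) from checkpoint, the shape in current model is "
    "torch.Size([100])."
)

# Matches one "size mismatch" entry and captures the name and both shapes.
PATTERN = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def parse_shape(text):
    """Turn '100, 8, 14' into (100, 8, 14)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def parse_mismatches(log):
    """Return a list of (param_name, checkpoint_shape, model_shape)."""
    return [
        (m["name"], parse_shape(m["ckpt"]), parse_shape(m["model"]))
        for m in PATTERN.finditer(log)
    ]


mismatches = parse_mismatches(LOG)
# Count which submodule the mismatched parameters belong to (here: "ln").
layer_kinds = Counter(name.split(".")[-2] for name, _, _ in mismatches)

for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
print(dict(layer_kinds))
```

Every mismatched parameter sits in an `ln` submodule, and the checkpoint stores a three-dimensional weight where the current model expects a one-dimensional one. That pattern points at the normalization layer configuration rather than at a corrupted file.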

gshuangchun avatar Oct 22 '23 09:10 gshuangchun

Same question.

Esther-qian avatar Dec 20 '23 06:12 Esther-qian

Same question here.

sabinakaminska95 avatar Jun 10 '24 14:06 sabinakaminska95

I solved it: add --use_ln, because these weights were trained with layernorm.

sabinakaminska95 avatar Jun 11 '24 08:06 sabinakaminska95

Dear user,

Information on matching the right weights to the right model is provided in the README.md file. The weight filenames describe their respective configurations.

Thanks

nwoyecid avatar Jun 11 '24 13:06 nwoyecid
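Since the filenames describe the configuration, the relevant hints can be read off programmatically. The sketch below is an illustration only: `describe_weights` and `suggested_flags` are hypothetical helpers, and the only token-to-flag mapping it assumes is the one reported in this thread (a `layernorm` token implies `--use_ln`). The README remains the authoritative source for every other setting.

```python
from pathlib import PurePosixPath

URL = (
    "https://s3.unistra.fr/camma_public/github/rendezvous/"
    "rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth"
)


def describe_weights(url_or_path):
    """Split a weight filename into its underscore-separated tokens."""
    stem = PurePosixPath(url_or_path).stem  # filename without ".pth"
    tokens = stem.split("_")
    return {
        "tokens": tokens,
        "layernorm": "layernorm" in tokens,
        "lowres": "lowres" in tokens,
    }


def suggested_flags(info):
    """Assumption from this thread: layernorm weights need --use_ln."""
    return ["--use_ln"] if info["layernorm"] else []


info = describe_weights(URL)
print(info["tokens"])
print(suggested_flags(info))
```

For the file in this issue, the tokens include `layernorm`, so the sketch suggests `--use_ln`, which matches the fix reported above.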