rcg

Size mismatch for MAGE at `model.load_state_dict(checkpoint['model'])` when running viz_rcg.ipynb

Open Yisher opened this issue 7 months ago • 9 comments

Hello! Thank you for your great work. I trained rdm.pth with main_rdm.py and mage.pth with main_mage.py. When I try to visualize the generation, I run into this problem:

```
RuntimeError                              Traceback (most recent call last)
Cell In[9], line 2
      1 checkpoint = torch.load(os.path.join('output/checkpoint-last.pth'), map_location='cpu')
----> 2 model.load_state_dict(checkpoint['model'], strict=True)
      3 model.cuda()
      4 _ = model.eval()

RuntimeError: Error(s) in loading state_dict for MaskedGenerativeEncoderViT:
    size mismatch for cls_token: copying a param with shape torch.Size([1, 1, 768]) from checkpoint, the shape in current model is torch.Size([1, 1, 1024]).
    size mismatch for pos_embed: copying a param with shape torch.Size([1, 257, 768]) from checkpoint, the shape in current model is torch.Size([1, 257, 1024]).
    size mismatch for mask_token: copying a param with shape torch.Size([1, 1, 768]) from checkpoint, the shape in current model is torch.Size([1, 1, 1024]).
    size mismatch for decoder_pos_embed: copying a param with shape torch.Size([1, 257, 768]) from checkpoint, the shape in current model is torch.Size([1, 257, 1024]).
    size mismatch for decoder_pos_embed_learned: copying a param with shape torch.Size([1, 257, 768]) from checkpoint, the shape in current model is torch.Size([1, 257, 1024]).
    size mismatch for token_emb.word_embeddings.weight: copying a param with shape torch.Size([2025, 768]) from checkpoint, the shape in current model is torch.Size([2025, 1024]).
    size mismatch for token_emb.position_embeddings.weight: copying a param with shape torch.Size([257, 768]) from checkpoint, the shape in current model is torch.Size([257, 1024]).
    ...
    size mismatch for decoder_pred.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([768, 1024]).
    size mismatch for mlm_layer.fc.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
    size mismatch for mlm_layer.fc.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
    size mismatch for mlm_layer.ln.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
    size mismatch for mlm_layer.ln.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
```

I can't understand why this happens. I used my own dataset for training and did not use distributed training.
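The error pattern suggests the notebook is building a larger MAGE variant (1024-dim embeddings, i.e. a ViT-Large backbone) than the one the checkpoint was trained with (768-dim, i.e. ViT-Base). Below is a minimal diagnostic sketch under that assumption; the constructor names `mage_vit_base_patch16` / `mage_vit_large_patch16` and their arguments are assumptions based on the usual MAGE naming convention, so check models_mage.py and the `--model` flag you passed to main_mage.py for the exact ones.

```python
import torch

# Peek at the checkpoint to see which architecture produced it.
checkpoint = torch.load('output/checkpoint-last.pth', map_location='cpu')
state = checkpoint['model']

# cls_token has shape [1, 1, embed_dim]: 768 -> ViT-Base MAGE, 1024 -> ViT-Large MAGE.
embed_dim = state['cls_token'].shape[-1]
print('checkpoint embedding dim:', embed_dim)

# Build the matching variant before calling load_state_dict. The constructor
# names below are an assumption; use the same model name (and keyword
# arguments) that main_mage.py used during training.
import models_mage
builder = (models_mage.mage_vit_base_patch16 if embed_dim == 768
           else models_mage.mage_vit_large_patch16)
model = builder()  # pass the same kwargs used during training, if any
msg = model.load_state_dict(state, strict=True)
print(msg)
model.cuda().eval()
```

In short: if the training run used the base-size MAGE (768-dim) while viz_rcg.ipynb instantiates the large one (1024-dim), either instantiate the base variant in the notebook or retrain with the large model so the shapes match.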

Yisher · Dec 13 '23, 08:12