
Unable to load the state_dict for BinModel in demo example

Open glgaines opened this issue 1 year ago • 5 comments

I'm working directly with the example code in https://github.com/dali92002/DocEnTR/blob/main/demo.ipynb.

I tried the pretrained model params from the model zoo. When calling model.load_state_dict, I ran into the following error:

RuntimeError: Error(s) in loading state_dict for BinModel:
    Missing key(s) in state_dict: "encoder.to_patch_embedding.2.weight", "encoder.to_patch_embedding.2.bias", "encoder.to_patch_embedding.3.weight", "encoder.to_patch_embedding.3.bias".
    size mismatch for encoder.pos_embedding: copying a param with shape torch.Size([1, 1025, 768]) from checkpoint, the shape in current model is torch.Size([1, 257, 768]).
    size mismatch for encoder.to_patch_embedding.1.weight: copying a param with shape torch.Size([768, 192]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for patch_to_emb.weight: copying a param with shape torch.Size([768, 192]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for decoder_pos_emb.weight: copying a param with shape torch.Size([1025, 768]) from checkpoint, the shape in current model is torch.Size([257, 768]).
    size mismatch for to_pixels.weight: copying a param with shape torch.Size([192, 768]) from checkpoint, the shape in current model is torch.Size([768, 768]).
    size mismatch for to_pixels.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([768]).

glgaines · May 12 '23
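These shape mismatches decode directly to a patch-size disagreement between the checkpoint and the freshly built model: on a 256×256 input, 8×8 patches give 1024 tokens (pos_embedding [1, 1025, 768], with one extra row for the CLS token) and a 192-value pixel output per patch (8 × 8 × 3), while 16×16 patches give 256 tokens ([1, 257, 768]) and 768 pixel values (16 × 16 × 3). A minimal sketch of that arithmetic, assuming vit-pytorch's usual layout:

    # Sketch (not repo code): the arithmetic behind the reported shapes.
    image_size, channels, dim = 256, 3, 768

    for patch_size in (8, 16):
        num_patches = (image_size // patch_size) ** 2
        pos_embedding = (1, num_patches + 1, dim)   # +1 row for the CLS token
        pixels_per_patch = channels * patch_size ** 2
        print(patch_size, pos_embedding, pixels_per_patch)

    # patch_size=8  -> (1, 1025, 768) and 192: matches the checkpoint
    # patch_size=16 -> (1, 257, 768)  and 768: matches the current model

So the downloaded checkpoint was trained with 8×8 patches, but the demo built the model for 16×16 patches (or vice versa).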

Same problem here.

daisyhaohao · May 30 '23

Try using Python version 3.8.12.

masrur-ahmed · Jun 08 '23

You should be careful to use the right model together with the matching vit_model_size and vit_patch_size.

dali92002 · Feb 26 '24
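Concretely, the encoder and BinModel have to be constructed with the same setting and patch size the checkpoint was trained with before load_state_dict can succeed. A hedged sketch of that setup, loosely following demo.ipynb (the exact constructor arguments live in the repo's model code; the "base" dimensions and the checkpoint path here are assumptions to adapt):

    import torch
    from vit_pytorch import ViT
    from models.binae import BinModel  # from the DocEnTR repo

    # Both values must match the downloaded checkpoint.
    patch_size = 8        # 8 or 16, per the checkpoint chosen from the model zoo
    image_size = 256

    # Encoder dimensions for the chosen setting ("base" assumed here).
    encoder = ViT(
        image_size=image_size,
        patch_size=patch_size,
        num_classes=1000,
        dim=768,
        depth=6,
        heads=8,
        mlp_dim=2048,
    )

    model = BinModel(
        encoder=encoder,
        decoder_dim=768,
        decoder_depth=6,
        decoder_heads=8,
    )

    state_dict = torch.load("path/to/checkpoint.pt", map_location="cpu")
    model.load_state_dict(state_dict)

With patch_size matching the checkpoint, the pos_embedding, patch_to_emb, and to_pixels shapes all line up.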

Same problem. Does anyone have a solution? :(

vm7608 · Apr 19 '24

I found the following solution:

  • Pin the package versions: vit-pytorch==0.24.3 and einops==0.3.2 (a quick version check is sketched below).
  • Choose the right pretrained checkpoint and set the matching values for the setting and patch_size.

vm7608 · Apr 20 '24
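The version pin addresses the missing encoder.to_patch_embedding.2/.3 keys: newer vit-pytorch releases restructure to_patch_embedding by adding LayerNorms around the patch-embedding Linear, which shifts the submodule indices, so parameter names saved under an older release no longer line up. Pinning, e.g. pip install vit-pytorch==0.24.3 einops==0.3.2, restores the layout the checkpoints were saved with. A quick sanity check before loading (a sketch; assumes a pip-installed environment):

    # Verify the pinned versions suggested above before calling load_state_dict.
    from importlib.metadata import version  # stdlib in Python 3.8+

    assert version("vit-pytorch") == "0.24.3", version("vit-pytorch")
    assert version("einops") == "0.3.2", version("einops")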