size mismatch for pos_embed
import torch.nn as nn
# `vits` and `load_pretrained_weights` are assumed to come from the
# DINO-style `vision_transformer` and utils modules used by this project.

class Net(nn.Module):
    def __init__(self, aff_classes=36):
        super(Net, self).__init__()
        self.aff_classes = aff_classes
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.gmp = nn.AdaptiveMaxPool2d(1)

        # --- hyper-parameters --- #
        self.aff_cam_thd = 0.6
        self.part_iou_thd = 0.6
        self.cel_margin = 0.5

        # --- dino-vit features --- #
        self.vit_feat_dim = 384
        self.cluster_num = 1
        self.stride = 14
        self.patch = 14
        self.vit_model = vits.__dict__['vit_small'](patch_size=self.patch, num_register_tokens=0)
        load_pretrained_weights(
            self.vit_model,
            'https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth',
            None)
The code above is how I load DINOv2, but loading the checkpoint fails with the error below. Is there a mistake in the way I load the model?

RuntimeError: Error(s) in loading state_dict for DinoVisionTransformer:
    size mismatch for pos_embed: copying a param with shape torch.Size([1, 1370, 384]) from checkpoint, the shape in current model is torch.Size([1, 257, 384]).
This solved it for me:
https://github.com/facebookresearch/dinov2/issues/316#issuecomment-1816378994
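For context on why the shapes disagree: the dinov2_vits14 checkpoint was pretrained at 518×518 resolution, so its positional embedding covers (518/14)² = 1369 patch tokens plus one [CLS] token, i.e. 1370 entries, while a model built at the common default of 224×224 expects 16² + 1 = 257. A minimal sketch of that arithmetic (assuming square inputs and no register tokens):

```python
def pos_embed_len(img_size: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Number of positional-embedding entries for a square ViT input."""
    grid = img_size // patch_size            # patches per side
    return grid * grid + num_prefix_tokens   # patch tokens + [CLS]

print(pos_embed_len(518, 14))  # 1370 -> matches the checkpoint's pos_embed
print(pos_embed_len(224, 14))  # 257  -> matches the freshly built model
```

Per the linked comment, constructing the model at the pretraining resolution (e.g. an `img_size=518` argument, if the repo's `vit_small` factory accepts one) makes the shapes line up; the alternative is to interpolate `pos_embed` to the target grid before calling `load_state_dict`.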