
size mismatch for pos_embed

Open · ccmCCMfk opened this issue 1 year ago · 1 comment

class Net(nn.Module):

    def __init__(self, aff_classes=36):
        super(Net, self).__init__()

        self.aff_classes = aff_classes

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.gmp = nn.AdaptiveMaxPool2d(1)

        # --- hyper-parameters --- #
        self.aff_cam_thd = 0.6
        self.part_iou_thd = 0.6
        self.cel_margin = 0.5

        # --- dino-vit features --- #
        self.vit_feat_dim = 384
        self.cluster_num = 1
        self.stride = 14
        self.patch = 14
        self.vit_model = vits.__dict__['vit_small'](patch_size=self.patch, num_register_tokens=0)
        load_pretrained_weights(self.vit_model, 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth', None)

The code above is how I load DINOv2, but it fails with the error below. What might be causing it? Is there a mistake in the way I load the model?

RuntimeError: Error(s) in loading state_dict for DinoVisionTransformer:
    size mismatch for pos_embed: copying a param with shape torch.Size([1, 1370, 384]) from checkpoint, the shape in current model is torch.Size([1, 257, 384]).

ccmCCMfk · Jan 09 '24 02:01

This solved it for me

https://github.com/facebookresearch/dinov2/issues/316#issuecomment-1816378994

zshn25 · Jan 19 '24 16:01
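
For anyone reading the error message itself: the two pos_embed lengths can be reproduced with simple arithmetic, which hints at where the mismatch likely comes from. The sketch below assumes pos_embed holds one class token plus one token per image patch, with no register tokens (matching num_register_tokens=0 in the question). The helper names num_pos_tokens and infer_img_size are made up for illustration and are not part of DINOv2.

```python
import math


def num_pos_tokens(img_size: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Length of pos_embed for a square image, assuming one prefix (class) token."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    grid = img_size // patch_size  # patches along one side
    return grid * grid + num_prefix_tokens


def infer_img_size(num_tokens: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Invert num_pos_tokens: recover the square image size from a pos_embed length."""
    num_patches = num_tokens - num_prefix_tokens
    grid = math.isqrt(num_patches)
    if grid * grid != num_patches:
        raise ValueError("token count does not correspond to a square patch grid")
    return grid * patch_size


if __name__ == "__main__":
    # Shapes taken from the error message in the question (patch size 14).
    print("checkpoint :", infer_img_size(1370, 14))  # 37 x 37 patches
    print("current    :", infer_img_size(257, 14))   # 16 x 16 patches
```

Under these assumptions, 1370 corresponds to a 518x518 image (37x37 patches plus the class token) and 257 to a 224x224 image (16x16 patches plus the class token). So the model in the question appears to be built for 224x224 inputs while the checkpoint stores position embeddings for 518x518. If that reading is right, constructing the model for 518x518 inputs should make the shapes agree. That is usually done through an image-size constructor argument, but whether the vits module used here exposes one is an assumption worth checking.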