How to use Dinov2 for self-supervised training?
Hello, I would like to ask: how can I use DINOv2 for self-supervised learning on unlabeled images to obtain pre-trained weights, and then load those pre-trained weights for supervised learning on labeled data (for image classification)?
Hi team,
Thanks to @zhanglaoban-kk for bringing this up; I have the same problem.
- For using unlabeled images: is this like what @TheoMoutakanni mentioned in #142, where the labels in training/eval/testing are all just for evaluation, so all we need to do is set the label to something like '0' or 'np.nan' and mock the ImageNet dataset (see the sketch after this list)?
- For loading pre-trained weights: I would like to use some unlabeled data to do post-pretraining on top of the vanilla DINOv2 checkpoint. I'm using the dinov2-vit-b14 checkpoint and loading it via dinov2/configs/ssl_default_config.yaml line 2:

```yaml
MODEL:
  WEIGHTS: 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth'
```
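If that reading of #142 is right, the mocked dataset could be as simple as the sketch below. It is a generic torch Dataset with a constant dummy label, purely an illustration; the class name, the *.jpg glob, and the flat folder layout are my assumptions, not the repo's ImageNet class:

```python
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class UnlabeledImageDataset(Dataset):
    """Folder of images with a constant dummy label (never used by the SSL loss)."""

    def __init__(self, root, transform=None):
        self.paths = sorted(Path(root).glob("*.jpg"))  # assumed layout: flat folder of .jpg files
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, 0  # dummy label, only there to satisfy the dataset interface
```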
However, it reports an error:

```
...
  File "/hpcfs/users/a1781032/default_studio/dinov2/dinov2/train/train.py", line 153, in do_train
    start_iter = checkpointer.resume_or_load(cfg.MODEL.WEIGHTS, resume=resume).get("iteration", -1) + 1
...
    checkpoint_state_dict = checkpoint.pop("model")
KeyError: 'model'
```
This happens because dinov2_vitb14_pretrain.pth does not have a 'model' key; it contains the architecture keys directly: 'cls_token', 'pos_embed', 'mask_token', 'patch_embed.proj.weight', 'patch_embed.proj.bias', 'blocks.0.norm1.weight', 'blocks.0.norm1.bias', 'blocks.0.attn.qkv.weight', 'blocks.0.attn.qkv.bias', 'blocks.0.attn.proj.weight', 'blocks.0.attn.proj.bias' ...
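For anyone reproducing this, the checkpoint layout is easy to confirm with a quick inspection (a minimal sketch, assuming the checkpoint file sits in the working directory):

```python
import torch

ckpt = torch.load("dinov2_vitb14_pretrain.pth", map_location="cpu")
print("model" in ckpt)  # False: there is no wrapper dict around the weights
print(list(ckpt)[:4])   # ['cls_token', 'pos_embed', 'mask_token', 'patch_embed.proj.weight']
```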
How can we load the provided pre-trained checkpoint into the model before training?
Best, Steve
Hi team,
I have some new findings. Looking through #154, @surajyakoa explained that they wanted to fine-tune DINOv2 with some unlabeled data, and @qasfb explained that this requires the full checkpoint, not just the backbone; since the full checkpoints haven't been released, it isn't practical.
Is that correct?
Steve
A sad moment. So it is not possible to fine-tune the current model from the hub?
Any news here?
Not for now. Does anyone know how to run self-distillation using the train file in the repo?
Hi,
My advice would be not to load the weights using the checkpointer's resume function, as it also expects the optimizer buffers etc. Instead, you can launch a normal training from scratch and load the pretrained weights before the first iteration with torch.load and model.load_state_dict. Make sure you load the weights into both the student and the teacher.
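A minimal sketch of that manual loading step. Here `model` stands for the SSL training model built by the train script, and the `student.backbone`/`teacher.backbone` attribute paths are my assumption; check your model for the exact names:

```python
import torch

# `model` is the SSL training model (hypothetical handle; see lead-in above).
state_dict = torch.load("dinov2_vitb14_pretrain.pth", map_location="cpu")

# Load the pretrained ViT weights into both networks before the first iteration.
model.student.backbone.load_state_dict(state_dict)
model.teacher.backbone.load_state_dict(state_dict)
```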
You may need to slightly rename the checkpoint keys: blocks.0.attn.qkv.weight may be named something like module.backbone.blocks.0.0.attn.qkv.weight or similar. I don't remember the exact mapping, but it should not be hard to find if you compare the keys on both sides.
You might also need to use strict=False, because the checkpoint does not include the head weights. However, make sure you print out the load message, so that you know whether the loading succeeded; silent load failures are a common source of bugs.
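Putting the renaming and the non-strict load together, a hedged sketch. The module.backbone. prefix is only a placeholder for whatever the real mapping turns out to be; find it by comparing the keys on both sides, as said above:

```python
import torch

ckpt = torch.load("dinov2_vitb14_pretrain.pth", map_location="cpu")

# Hypothetical remap: prepend the prefix the training model expects.
# Discover the real one by comparing model.state_dict().keys() with ckpt.keys().
remapped = {f"module.backbone.{k}": v for k, v in ckpt.items()}

# strict=False because the checkpoint has no head weights; print the result
# so a silently failed load does not go unnoticed.
result = model.load_state_dict(remapped, strict=False)  # `model`: the training model, as above
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)
```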
The only real question is how well the training will work with a pretrained backbone and a randomly initialized head. I think it may be beneficial to freeze the backbone and train only the heads for a few thousand iterations, so they have a bit of time to fit, then unfreeze the whole model and start the real training.
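One way to do that warm-up, sketched below. The name-based match on "backbone" is an assumption about the parameter naming; adjust it to whatever your state dict actually shows:

```python
import torch.nn as nn


def set_backbone_trainable(model: nn.Module, trainable: bool) -> None:
    """Toggle requires_grad on every parameter whose name mentions 'backbone'."""
    for name, param in model.named_parameters():
        if "backbone" in name:
            param.requires_grad = trainable


# Warm-up: freeze the backbone so the random-init heads have time to fit.
# set_backbone_trainable(model, False)
# ... run a few thousand iterations ...
# Then unfreeze everything and start the real training.
# set_backbone_trainable(model, True)
```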
Best of luck,
Tim
Hi Steve, I have the same requirement. I also want to train DINOv2 on unlabeled images. Does it work with the mocked [ImageNet] dataset? Could you share some details with me?
Best, Luo