dinov2

How to use the DINOv2 pretrained ViT model for a downstream task?

Open kaixinbear opened this issue 1 year ago • 0 comments

Thanks for your great work, it really impresses me! I would like to try it in my research. Specifically, I want to use ViT-small to replace the ImageNet-pretrained backbone for a monocular 3D object detection task. The parameter counts of the two networks are comparable, so I expected the DINOv2-pretrained ViT-small to perform better. However, the results show that the DINOv2-pretrained ViT-small performs 20% worse, and the loss struggles to converge. I have already tuned the learning rate, so what else can I do to make the ViT backbone usable?

kaixinbear, Apr 21 '23 03:04