
ConvNeXt

Open SimJeg opened this issue 1 year ago • 1 comments

Hello,

Have you considered using the ConvNeXt architecture for training DINOv2?

ConvNeXt has been shown to deliver better performance and lower latency on tasks such as CLIP. For example, in the open_clip repository, ConvNeXt-L@320 achieves a +1.4% increase in zero-shot accuracy and is more than twice as fast as ViT-L@336.

While ViT may be easier to use in a "tokenize everything for my transformer world" approach, it's worth considering that CNNs still deserve... attention ^^

Best, Simon

SimJeg, Apr 18 '23 09:04