DinoV3 support?

Open 201power opened this issue 3 months ago • 7 comments

Hi there, can I use lightly with DINOv3? I want to train a YOLO11 backbone with a DINOv3 head using SSL.

201power avatar Sep 11 '25 01:09 201power

Hey @201power !

We are not offering DINOv3 pretraining at this point, but the pretrained checkpoints from Meta are already super performant.

Just out of curiosity: Why would you like to integrate the DINOv3 ViT in a YOLO model? Don't you think you might get better performance/compatibility with a DETR variant instead?

liopeer avatar Sep 11 '25 07:09 liopeer

Sorry, I meant training a YOLO11n backbone + DINOv3 head with SSL. Right now I am experimenting with a YOLO11n backbone + DINOv2 head with SSL in lightly.

My application is real-time and inference time is critical. It looks like YOLO11 performs better than DETR for real-time applications (small models). https://docs.ultralytics.com/compare/rtdetr-vs-yolo11/

201power avatar Sep 11 '25 07:09 201power

I am a bit confused: Are you trying to pretrain only your YOLO backbone with SSL and then attach the YOLO head again afterwards? Or do you keep the DINOv2 head afterwards? If you keep it, what is the downstream task you would like to solve?

> My application is real-time and inference time is critical. It looks like YOLO11 performs better than DETR for real-time applications (small models). https://docs.ultralytics.com/compare/rtdetr-vs-yolo11/

One issue with these benchmarks is that NMS (non-maximum suppression) is often not included in the timings for the YOLO models, even though the raw output is unusable without it. So I would recommend checking the RT-DETR paper or running a few benchmarks of your own (with NMS enabled).
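To make the NMS point concrete, here is a minimal sketch of greedy NMS plus a small latency helper in plain Python. It is only an illustration, not the implementation used by Ultralytics or any benchmark suite; the boxes, scores, the `latency_ms` helper, and the 0.5 IoU threshold are all made up for the example.

```python
import time


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def latency_ms(fn, n_runs=50, warmup=5):
    """Mean wall-clock latency of fn() in milliseconds (hypothetical helper)."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / n_runs


# Made-up raw detections standing in for a detector's pre-NMS output.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

For a fair comparison against RT-DETR, the callable passed to `latency_ms` should wrap both the model's forward pass and the NMS step, not the forward pass alone.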

liopeer avatar Sep 11 '25 11:09 liopeer

Here is my plan:

  1. Train the YOLO backbone + DINOv2 head with SSL to create pretrained checkpoint A
  2. Train the full YOLO model for detection, initialized from checkpoint A

Thanks for the pointer, I'll test RT-DETR

201power avatar Sep 11 '25 11:09 201power

Yes, I would definitely give RT-DETR a shot. Another thing I can recommend if you haven't already checked it out is our other repo LightlyTrain: https://docs.lightly.ai/train/stable/index.html

LightlyTrain supports pretraining YOLO models out of the box with just 5 lines of code. You can also use it to do knowledge distillation from DINOv2 or DINOv3 into your YOLO models, which often makes it easier to reach good performance than "true" SSL does. You can check this issue to see info on how you can get distillation from DINOv3 going.

liopeer avatar Sep 12 '25 07:09 liopeer

> Hey @201power !
>
> We are not offering DINOv3 pretraining at this point, but the pretrained checkpoints from Meta are already super performant.
>
> Just out of curiosity: Why would you like to integrate the DINOv3 ViT in a YOLO model? Don't you think you might get better performance/compatibility with a DETR variant instead?

I have not tested the DINOv3 pretrained model, but I would love to be able to pretrain it on custom, specialized scientific datasets.

MosGeo avatar Sep 13 '25 17:09 MosGeo

> I have not tested the DINOv3 pretrained model, but I would love to be able to pretrain it on custom, specialized scientific datasets.

Got it! We definitely plan on supporting DINOv3 pretraining (or at least parts of the training stages) in the mid-term future. But I think we'll only get to work on it towards the end of the year.

liopeer avatar Sep 14 '25 10:09 liopeer