Reproduction Code for DPT-based segmentation head for GeoBench
Hi! We've open-sourced a minimal, reproducible pipeline for DINOv3 (ViT-L) + DPT segmentation on GeoBench.
The repo includes model wrappers for DINOv3 and for the DPT segmentation head. The dataset wrapper comes from the DOFA repo.
We reproduce mIoU close to the numbers reported in the paper on three of the six GeoBench segmentation tasks (m-NeonTree, m-nz-cattle, m-pv4ger-seg).
I am still not sure what causes the large gap on the other three tasks; it might be the preprocessing pipeline.
Feedback and PRs are welcome.
The code is in the repo.
Hi! Thank you for your interest in DINOv3. Looking quickly at your repo's README, I see you mention "Blocks extracted for DPT: -4, -3, -2, -1 (not mentioned in paper)". For reference, we used 4 intermediate layers of the backbone.
Also, the numbers you quote from our paper were produced with a ViT-L pretrained on satellite data (not LVD), so I think that also accounts for part of the difference. I hope this helps you get closer numbers on the GeoBench tasks.
If I remember correctly, DPT typically uses layers {5, 12, 18, 24} for ViT-L.
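To make the difference between the two layer choices concrete, here is a minimal sketch that compares the README's blocks (-4, -3, -2, -1) with the DPT-style set {5, 12, 18, 24}. It assumes the DPT numbers are 1-based layer numbers (24 would be out of range as a 0-based index in a 24-block ViT-L) and that the negative values are Python-style indices into the block list. The helper names are hypothetical and not taken from the DINOv3 repo or the reproduction repo.

```python
VIT_L_DEPTH = 24  # ViT-L has 24 transformer blocks


def resolve_block_indices(indices, depth):
    """Turn 0-based, possibly negative block indices into sorted non-negative ones."""
    resolved = []
    for i in indices:
        j = i + depth if i < 0 else i
        if not 0 <= j < depth:
            raise ValueError(f"block index {i} is out of range for depth {depth}")
        resolved.append(j)
    return sorted(resolved)


def layers_to_block_indices(layers, depth):
    """Convert 1-based layer numbers (an assumed convention) to 0-based block indices."""
    for layer in layers:
        if not 1 <= layer <= depth:
            raise ValueError(f"layer {layer} is out of range for depth {depth}")
    return sorted(layer - 1 for layer in layers)


# Blocks listed in the README: the last four blocks of the backbone.
last_four = resolve_block_indices([-4, -3, -2, -1], VIT_L_DEPTH)

# DPT-style choice quoted above, read as 1-based layer numbers (assumption).
dpt_style = layers_to_block_indices([5, 12, 18, 24], VIT_L_DEPTH)

print("last four blocks:", last_four)   # [20, 21, 22, 23]
print("DPT-style blocks:", dpt_style)   # [4, 11, 17, 23]
```

Under these assumptions the two choices share only the final block; the DPT-style set also draws on much earlier features. The resulting 0-based list could then be passed to whatever mechanism the model wrapper uses to collect intermediate features, for example a `get_intermediate_layers`-style call if the backbone exposes one.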