CHENG XIN
A simple method is to extract features with backbones of different sizes and then cluster them directly with k-means. If everything goes smoothly, you can see the differences between backbones...
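As a rough illustration of that comparison, here is a minimal sketch in plain Python. It assumes the features have already been extracted; since no real backbone is loaded here, `fake_features` is a hypothetical stand-in that generates synthetic vectors, and the backbone names, the `spread` values, and the purity score are assumptions chosen for the example, not part of the original suggestion.

```python
import random
from collections import Counter


def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (cluster id per point, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


def purity(cluster_ids, true_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    total = 0
    for c in set(cluster_ids):
        members = [t for cid, t in zip(cluster_ids, true_labels) if cid == c]
        total += Counter(members).most_common(1)[0][1]
    return total / len(true_labels)


def fake_features(spread, n_per_class=40, dim=8, n_classes=3, seed=0):
    """Hypothetical stand-in for backbone features: Gaussian blobs per class.

    A smaller `spread` imitates a backbone whose features separate classes
    more cleanly. Replace this with real extracted features.
    """
    rng = random.Random(seed)
    feats, labels = [], []
    for cls in range(n_classes):
        center = [5.0 * cls] * dim
        for _ in range(n_per_class):
            feats.append([rng.gauss(m, spread) for m in center])
            labels.append(cls)
    return feats, labels


# Assumed names and noise levels, for illustration only.
backbones = {"small (hypothetical)": 4.0, "large (hypothetical)": 1.0}
results = {}
for name, spread in backbones.items():
    feats, labels = fake_features(spread)
    cluster_ids, _ = kmeans(feats, k=3)
    results[name] = purity(cluster_ids, labels)
    print(f"{name}: purity = {results[name]:.2f}")
```

With real data, the same `kmeans` and `purity` calls would be run once per backbone on that backbone's features, so the scores can be compared side by side. Purity needs ground-truth labels; without them, inspecting the clusters visually is a reasonable fallback.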
In my opinion, DINOv2 mainly aims to provide a universal, task-independent feature extraction backbone, and specific downstream adjustments still need to be made according to the domain....
We trained models for many different tasks, the smallest using 500 samples at 518px and the largest using 500,000 samples at 518px. Generally, freezing training for about 10...
I also trained on my own dataset and found that the loss decreased rapidly and then became difficult to reduce further, but the final model performance was still good. I suggest using freeze...
> I got NaN during training. I think it is because I loaded the model as float16? I found that when training vitb, if I set qkv_bias to true, nan...