Semantic segmentation

Open anshkumar opened this issue 1 year ago • 9 comments

I can't find the code for semantic segmentation. The paper says:

 a linear layer is trained to predict class logits from a patch tokens. It is used to produce a low-
resolution logit map (eg 32x32 for a model with patch size 16), which is then upsampled to full resolution
(512x512) to obtain a segmentation map. 

Does this mean a linear layer with 32*32 = 1024 output classes needs to be trained? What about n_last_blocks_list = [1, 4] and n_last_blocks = max(n_last_blocks_list)? Do they need to be changed to n_last_blocks_list = [1, 1] and n_last_blocks = max(n_last_blocks_list)?

Is there any sample code for semantic segmentation?

anshkumar, Apr 19 '23 09:04
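
One possible reading of the quoted passage, as a rough PyTorch sketch rather than the authors' code: the linear layer is applied to each patch token separately and outputs one logit per class, so its output size is the number of classes, not 32*32. The 32x32 is just the spatial grid of patches (512 / 16 = 32 per side). Everything named here is an assumption for illustration: the LinearSegHead class, the embedding size of 384, the 21 classes, row-major ordering of the patch tokens, and bilinear upsampling. Random tensors stand in for the patch tokens that a frozen backbone would produce.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearSegHead(nn.Module):
    """Hypothetical linear head: one vector of class logits per patch token."""

    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        # Output size is the number of classes, not the number of patches.
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens, grid_hw, out_hw):
        # patch_tokens: (batch, num_patches, embed_dim), assumed row-major over the grid
        b, n, _ = patch_tokens.shape
        gh, gw = grid_hw
        assert n == gh * gw, "number of tokens must match the patch grid"
        logits = self.classifier(patch_tokens)                   # (B, N, K)
        logits = logits.transpose(1, 2).reshape(b, -1, gh, gw)   # (B, K, gh, gw) low-res logit map
        # Upsample the low-resolution logit map to full image resolution
        # (bilinear mode is an assumption; the quoted text does not name one).
        return F.interpolate(logits, size=out_hw, mode="bilinear", align_corners=False)


# Assumed values for illustration only.
embed_dim, num_classes = 384, 21
patch_tokens = torch.randn(1, 32 * 32, embed_dim)  # stand-in for backbone patch tokens

head = LinearSegHead(embed_dim, num_classes)
full_logits = head(patch_tokens, grid_hw=(32, 32), out_hw=(512, 512))
seg_map = full_logits.argmax(dim=1)  # (B, 512, 512) map of predicted class indices
```

Under this reading, only the head would be trained (for example with a per-pixel cross-entropy loss on full_logits) while the backbone stays frozen. If features from several of the last blocks were used, one option would be to concatenate them along the channel dimension and enlarge embed_dim accordingly, but that is a guess and not something the quoted passage states.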