
Question about the segmentation branch

Open owen24819 opened this issue 3 years ago • 2 comments

Hi,

Nice work. I see that you use the highest-resolution backbone feature map and encoder feature map to generate the pixel embedding map. Did you try including lower-resolution feature maps (backbone or encoder), and if so, did performance improve?

Thanks, Owen

https://github.com/IDEA-Research/MaskDINO/blob/76c8e4536ad8f01ed97f71fe47dd05518b5dbdaf/maskdino/modeling/pixel_decoder/maskdino_encoder.py#L415-L428
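For readers following along, the fusion described in the question can be sketched roughly as below. This is a minimal illustration and not the code at the linked lines: the class name `SingleScaleMaskFeatures`, the helper `predict_masks`, the channel sizes, and the choice of bilinear upsampling are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SingleScaleMaskFeatures(nn.Module):
    """Hypothetical sketch: fuse one backbone map with one encoder map."""

    def __init__(self, backbone_channels: int, embed_dim: int):
        super().__init__()
        # 1x1 conv projects the backbone map to the embedding width.
        self.lateral_conv = nn.Conv2d(backbone_channels, embed_dim, kernel_size=1)
        # 3x3 conv smooths the summed features.
        self.output_conv = nn.Conv2d(embed_dim, embed_dim, kernel_size=3, padding=1)

    def forward(self, backbone_feat: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
        lateral = self.lateral_conv(backbone_feat)
        # Upsample the encoder map to the backbone map's spatial size.
        upsampled = F.interpolate(
            encoder_feat, size=lateral.shape[-2:], mode="bilinear", align_corners=False
        )
        return self.output_conv(lateral + upsampled)


def predict_masks(query_embed: torch.Tensor, pixel_embed: torch.Tensor) -> torch.Tensor:
    # Dot product of each query with every pixel embedding:
    # (B, Q, C) x (B, C, H, W) -> (B, Q, H, W) mask logits.
    return torch.einsum("bqc,bchw->bqhw", query_embed, pixel_embed)
```

Here the pixel embedding map depends on exactly one backbone level and one encoder level, which is the situation the question asks about.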

owen24819 · Jan 12 '23 15:01

Yes. By default, the highest-resolution map in the encoder is at 1/8 scale. The highest-resolution encoder map we have used is at 1/4 scale (see our 5-scale model), which improves performance by around 0.5 AP.

FengLi-ust · Jan 15 '23 02:01

Hi, thanks for the response. I was specifically wondering whether you fed multiple encoder feature maps to the segmentation head, for example the 1/4, 1/8, and 1/16 encoder maps. The code I highlighted above looks as if it was written with this in mind, so perhaps you tried it.
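To make the idea concrete, one way multiple encoder maps could be fed to the segmentation head is an FPN-style top-down fusion. The sketch below is only an assumption about what such a variant might look like; `MultiScaleMaskFeatures` is a hypothetical name, and nothing here is taken from the MaskDINO repository or reflects results reported by the authors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleMaskFeatures(nn.Module):
    """Hypothetical sketch: merge several encoder maps into one mask feature map."""

    def __init__(self, embed_dim: int, num_levels: int = 3):
        super().__init__()
        # One 3x3 conv per level, applied after each merge step.
        self.output_convs = nn.ModuleList(
            nn.Conv2d(embed_dim, embed_dim, kernel_size=3, padding=1)
            for _ in range(num_levels)
        )

    def forward(self, encoder_feats: list) -> torch.Tensor:
        # encoder_feats is ordered from highest to lowest resolution,
        # e.g. [1/4, 1/8, 1/16], all with embed_dim channels.
        if len(encoder_feats) != len(self.output_convs):
            raise ValueError("expected one feature map per level")
        # Start from the coarsest map and walk towards the finest one.
        fused = self.output_convs[-1](encoder_feats[-1])
        for level in range(len(encoder_feats) - 2, -1, -1):
            finer = encoder_feats[level]
            fused = F.interpolate(
                fused, size=finer.shape[-2:], mode="bilinear", align_corners=False
            )
            fused = self.output_convs[level](finer + fused)
        return fused  # same spatial size as the highest-resolution input
```

The output has the resolution of the first (finest) input, so it could stand in for a single-scale pixel embedding map without changing the mask prediction step.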

owen24819 · Jan 16 '23 19:01