MaskDINO The implementation of Hybrid Matching

Thanks for the great work.

I have read the paper, but I could not find enough detail about how hybrid matching is implemented, or if it is there, I could not follow it. Could you elaborate on it, or will it be covered in more detail in the second version of the paper?

Thanks in advance, Esat

Jun 13 '22 14:06 artest08

Hi, hybrid matching means that the matching cost combines the classification loss, mask loss, and box loss (the three are added together). It is simple, so we do not elaborate on it in the paper. We also provide an experimental analysis of different matching strategies; please refer to the ablation study for more details. Thank you.
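To make the description above concrete, here is a minimal sketch of a hybrid matching cost. It is not taken from the MaskDINO code: the function names `hybrid_cost` and `match`, the unit default weights, and the toy cost values are all assumptions for illustration, and the brute-force search stands in for the Hungarian algorithm that DETR-style matchers normally use.

```python
from itertools import permutations


def hybrid_cost(cost_class, cost_mask, cost_box,
                w_class=1.0, w_mask=1.0, w_box=1.0):
    """Weighted sum of three [num_queries x num_targets] cost matrices.

    The unit default weights are an assumption, not confirmed values.
    """
    return [
        [w_class * c + w_mask * m + w_box * b
         for c, m, b in zip(row_c, row_m, row_b)]
        for row_c, row_m, row_b in zip(cost_class, cost_mask, cost_box)
    ]


def match(cost):
    """Brute-force minimum-cost one-to-one assignment.

    A stand-in for the Hungarian algorithm, only practical for tiny inputs.
    Requires num_queries >= num_targets.
    Returns a list of (query_index, target_index) pairs.
    """
    num_queries, num_targets = len(cost), len(cost[0])
    best = min(
        permutations(range(num_queries), num_targets),
        key=lambda qs: sum(cost[q][t] for t, q in enumerate(qs)),
    )
    return [(q, t) for t, q in enumerate(best)]


# Toy costs for 3 queries and 2 targets (made-up numbers).
cost_class = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
cost_mask = [[0.3, 0.7], [0.9, 0.2], [0.4, 0.6]]
cost_box = [[0.1, 0.8], [0.7, 0.3], [0.6, 0.4]]

pairs = match(hybrid_cost(cost_class, cost_mask, cost_box))
print(pairs)  # [(0, 0), (1, 1)]
```

The point of the sketch is only that a single combined matrix drives the assignment, so classification, mask, and box terms all influence which query is paired with which target.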

Jun 14 '22 11:06 FengLi-ust

Are the mixing weights for the box and mask matching matrices kept equal? It's not very clear from Table 12.

Jul 08 '22 14:07 vadimkantorov

Hey, we present these details in the appendix.

Jul 09 '22 01:07 FengLi-ust

I did check the appendix of MaskDINO (before posting the comment and again just now). It does not mention the weights used for composing the final matching matrix. In fact, I could not find any discussion of hybrid matching in the appendix.

The only discussion of hybrid matching in the paper is in 4.3 Ablation studies, which just says: "Matching. In Table 12, we show that only using boxes or masks to perform bipartite matching is not optimal in Mask DINO. A unified matching objective makes the optimization more consistent." Table 12 does not mention the actual weights used for composing the final matching matrix, hence my question. I assume weights=1 were used, but it would be nice to have an explicit confirmation.

For example, here is the matching matrix construction from the DeformableDETR codebase: https://github.com/fundamentalvision/Deformable-DETR/blob/main/models/matcher.py#L91. I'm wondering what these cost_* weights are in your case.

Jul 09 '22 08:07 vadimkantorov

Sorry for the unclear description. In the appendix, we provide the weights of the different loss terms. The matching cost uses the same weights as the corresponding losses.

Jul 09 '22 08:07 FengLi-ust

Oh, that's interesting, because in DeformableDETR they are not the same at all. E.g. bbox_loss_coef==5, while cost_bbox==1
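A small sketch of why the distinction matters: the weights in the matching cost can change which query is assigned to which target, independently of how the losses are weighted afterwards. Everything below is hypothetical. The `assign` helper and the cost values are made up, and the weights 1 and 5 simply echo the numbers quoted above without being checked against any repository's configuration.

```python
from itertools import permutations


def assign(cost_class, cost_box, w_class, w_box):
    """Return, for each target t, the query index matched to it.

    Brute-force search over permutations of a square cost matrix,
    used here as a stand-in for the Hungarian algorithm.
    """
    n = len(cost_class)

    def total(perm):
        return sum(
            w_class * cost_class[q][t] + w_box * cost_box[q][t]
            for t, q in enumerate(perm)
        )

    return min(permutations(range(n)), key=total)


# Made-up costs: classification prefers the identity pairing,
# box distance prefers the swapped pairing.
cost_class = [[0.1, 0.9], [0.9, 0.1]]
cost_box = [[0.5, 0.2], [0.2, 0.5]]

unit_weights = assign(cost_class, cost_box, w_class=1.0, w_box=1.0)
box_heavy = assign(cost_class, cost_box, w_class=1.0, w_box=5.0)

print(unit_weights)  # (0, 1): identity pairing
print(box_heavy)     # (1, 0): swapped pairing
```

With unit weights the classification term dominates, while a box weight of 5 flips the assignment, which is why knowing the exact matching weights matters when reproducing results.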

Jul 09 '22 08:07 vadimkantorov