DiscoBox Some questions regarding the DiscoBox paper

Some questions regarding the DiscoBox paper

Open tianyufang1958 opened this issue 1 year ago • 2 comments

Thanks for the nice work and it runs smoothly with Docker. I have some questions with paper and hope you can give me some help.

You mention both YOLACT++ and SOLO V2, but it is not clear which one you use in Figure 2, does it mean Discobox can use either of them as long as it has mask head?
In section 3.2, fi and fk represent the ROI features of pixel i, could you please clarify what ROI means? Is the mask area within the bounding box? Also what the features are? RGB values and spatial information?
For the Structured teach, Tc is the cross image potentials, does it mean the comparison of one boxing box with all other bboxes in other images with the same label?
For the self-ensembling loss, you mentioned self consistency between our task and teach networks were calculated, but I am not clear how the Lnce? Does it compare the the masks features within bounding boxes across the teacher and task network?
In structure teacher, Gibbs energy was defined with unary potentials, pair potentials and cross-image pair potentials, but in the learning section, the loss function does not have them. So question is how the learning can correlate with it and minimise the energy? Apology I am not very familiar with standard mean field.

Mar 29 '23 15:03 tianyufang1958