YOLOX: Why is only the max cls score output during inference?

I noticed that YOLOX only outputs the max cls score for each anchor point during inference. Why?

Jul 03 '22 07:07 Icecream-blue-sky

One anchor matches one box in a single class.

Jul 04 '22 11:07 FateScript

I haven't seen max-score used in other anchor-free methods. Why is it better than splitting the predicted bbox of each anchor into n (number of classes) bboxes with different scores, as other anchor-free methods do?

Jul 04 '22 12:07 Icecream-blue-sky

One box with multiple classes is widely used, e.g. in RetinaNet and FCOS (anchor-free). In my opinion, the method you are talking about is used in two-stage detectors like Faster R-CNN.

Jul 05 '22 03:07 FateScript

I mean that YOLOX is one box, single class, while RetinaNet and FCOS are one box, multiple classes. At inference, YOLOX uses torch.max before NMS to turn each bbox with multi-class scores into a bbox carrying only its max score; RetinaNet and FCOS don't do this...

Jul 06 '22 06:07 Icecream-blue-sky
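
To make the two strategies in this exchange concrete, here is a minimal sketch of both ways to turn per-anchor class scores into NMS candidates. It is not code from YOLOX, RetinaNet, or FCOS: the function names, the threshold, and the toy scores are all hypothetical, NumPy stands in for torch, and an objectness term is used in both variants purely so the results are comparable.

```python
import numpy as np

def max_class_candidates(obj, cls, thresh):
    """One candidate per anchor: keep only the best-scoring class."""
    class_pred = cls.argmax(axis=1)          # index of the best class per anchor
    class_conf = cls.max(axis=1)             # score of that class
    score = obj * class_conf
    keep = score >= thresh
    return np.nonzero(keep)[0], class_pred[keep], score[keep]

def per_class_candidates(obj, cls, thresh):
    """Up to num_classes candidates per anchor: every (anchor, class) pair above thresh."""
    score = obj[:, None] * cls               # shape (num_anchors, num_classes)
    anchor_idx, class_idx = np.nonzero(score >= thresh)
    return anchor_idx, class_idx, score[anchor_idx, class_idx]

# Toy input (made up): 2 anchors, 3 classes.
obj = np.array([0.9, 0.8])
cls = np.array([[0.7, 0.6, 0.1],
                [0.2, 0.9, 0.1]])

a1, c1, s1 = max_class_candidates(obj, cls, 0.5)
a2, c2, s2 = per_class_candidates(obj, cls, 0.5)
print(len(a1), len(a2))
```

With these toy numbers the first anchor has two classes above the threshold, so the per-class variant hands one more candidate to NMS than the max-class variant does. The max-class reduction keeps the candidate count bounded by the number of anchors, at the cost of dropping a box's second-best class.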

@FateScript In scores = outputs[:, 4] * outputs[:, 5], what do positions 4 and 5 of the results hold, and why do we multiply them?

Jul 03 '23 14:07 jaideep11061982

The code here comes from a model named FCOS, and it is very common in detection algorithms.

Jul 04 '23 10:07 FateScript

@FateScript Thanks, but what I want to know is what positions 4 and 5 contain.

Jul 05 '23 11:07 jaideep11061982
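
The thread leaves this last question open. As far as can be told from the YOLOX postprocess step, each output row appears to be laid out as (x1, y1, x2, y2, obj_conf, class_conf, class_pred), which would make position 4 the objectness confidence and position 5 the confidence of the best class. Treat that layout as an assumption and check it against your YOLOX version. Under that assumption, a minimal sketch with made-up rows (NumPy standing in for torch):

```python
import numpy as np

# Hypothetical detections; assumed column layout:
# x1, y1, x2, y2, obj_conf, class_conf, class_pred
outputs = np.array([
    [10.0, 20.0, 110.0, 220.0, 0.9, 0.8, 2.0],
    [15.0, 25.0,  60.0,  90.0, 0.5, 0.4, 0.0],
])

boxes = outputs[:, :4]                  # corner coordinates
obj_conf = outputs[:, 4]                # "is there an object here at all?"
class_conf = outputs[:, 5]              # confidence of the best class
class_ids = outputs[:, 6].astype(int)   # index of the best class

# Final score: objectness times best-class confidence.
scores = obj_conf * class_conf
print(scores, class_ids)
```

Multiplying the two means a detection only scores highly when the model is confident both that an object is present and which class it belongs to.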