
Why use nonempty bins rather than all bins?

Open DHPO opened this issue 5 years ago • 1 comment

Why do you divide the weights by the number of nonempty bins (n) rather than by the number of all bins (self.bins)? https://github.com/libuyu/GHM_Detection/blob/3647287710416c91077805d504349fb947c2e9bd/mmdetection/mmdet/core/loss/ghm_loss.py#L54 I think M is the number of all bins in the paper. Am I missing something?

DHPO avatar Mar 14 '19 07:03 DHPO

@DHPO You are right. In the paper, we define M as the number of all bins. In the latest version of our code, however, we use the number of valid (non-empty) bins instead.

Suppose you have 100 bins and all examples have the same gradient norm of 0.8 (although this is impossible in practice). According to the original equation, each example then gets a harmonizing parameter of 1/100, and with 10000 bins the parameter becomes 1/10000. In such cases, however, we would like every example to get a harmonizing parameter of 1, since the examples should be treated equally and none should be down-weighted. The harmonizing parameters should also not depend on the number of bins. We therefore think that normalizing by the number of valid bins is more reasonable.
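The example above can be checked numerically with a minimal sketch. This is a simplified pure-Python illustration, not the code from ghm_loss.py: the function name `ghm_weights` and the flag `use_valid_bins` are hypothetical, and gradient norms are assumed to lie in [0, 1] with equal-width bins.

```python
def ghm_weights(g, bins=10, use_valid_bins=True):
    """Per-example harmonizing weights for gradient norms g in [0, 1].

    Hypothetical helper for illustration only (not the repository code).
    """
    tot = len(g)
    # Assign each gradient norm to one of `bins` equal-width bins;
    # a norm of exactly 1.0 is clamped into the last bin.
    idx = [min(int(x * bins), bins - 1) for x in g]
    # Count how many examples fall into each non-empty bin.
    counts = {}
    for i in idx:
        counts[i] = counts.get(i, 0) + 1
    # Normalize by the number of non-empty bins (n) or by all bins (M).
    divisor = len(counts) if use_valid_bins else bins
    return [tot / counts[i] / divisor for i in idx]


# All examples share the same gradient norm of 0.8.
g = [0.8] * 1000
print(ghm_weights(g, bins=100, use_valid_bins=False)[0])    # normalized by all 100 bins
print(ghm_weights(g, bins=10000, use_valid_bins=False)[0])  # normalized by all 10000 bins
print(ghm_weights(g, bins=100, use_valid_bins=True)[0])     # normalized by non-empty bins
print(ghm_weights(g, bins=10000, use_valid_bins=True)[0])   # normalized by non-empty bins
```

Under these assumptions, dividing by all bins makes the weight shrink as the bin count grows, while dividing by the non-empty bins keeps it at 1 regardless of how many bins are configured.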

Thank you for reading the code and paper so carefully.

libuyu avatar Mar 14 '19 09:03 libuyu