
Computation of the loss reweighting

Open nicolas-dufour opened this issue 2 years ago • 1 comments

Hey, I've noticed that the loss reweighting used in the code doesn't match the equation in the paper. Here, the formula is BxHxW/(num_pixels_of_class_i*num_of_non_zeros_classes_in_batch), whereas the paper gives the following equation: BxHxW/(num_pixels_of_class_i). Did I get something wrong? If not, can you clarify which one is the correct equation? Thanks
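To make the difference concrete, here is a minimal sketch of the two formulas in plain Python. It is not the repository's code: the function name `class_weights`, the example pixel counts, and the choice to give absent classes a weight of 0 are all assumptions made for illustration.

```python
def class_weights(pixel_counts, divide_by_present_classes):
    """Per-class weights from per-class pixel counts over a whole batch.

    Assumes every pixel belongs to exactly one class, so the counts
    sum to B*H*W. Absent classes get weight 0.0 (an assumption here).
    """
    total = sum(pixel_counts)                         # B*H*W
    present = sum(1 for n in pixel_counts if n > 0)   # classes with non-zero occurrence
    weights = []
    for n in pixel_counts:
        if n == 0:
            weights.append(0.0)
            continue
        w = total / n                                 # formula from the paper
        if divide_by_present_classes:
            w /= present                              # extra divisor in the code
        weights.append(w)
    return weights


# Hypothetical batch: 100 pixels, 4 classes, the last one absent.
counts = [60, 30, 10, 0]
paper_weights = class_weights(counts, divide_by_present_classes=False)
code_weights = class_weights(counts, divide_by_present_classes=True)
```

In this example the two variants differ only by a constant factor (the 3 classes present in the batch), so the relative weighting between classes is the same. Only the overall scale changes.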

nicolas-dufour, Feb 15 '22 16:02

Hi,

Thanks for pointing out the inconsistency. That's a valid point. Without the num_of_non_zeros_classes_in_batch divisor, the overall scale of the loss for a batch would be approximately proportional to the number of classes with non-zero occurrence in that batch. So it makes sense to add this divisor to the loss to balance the contributions of all batches. An alternative would be to pass the coefficients to the loss function, F.cross_entropy(...., weight=coefficients), which would scale them properly. As I remember, this approach underperformed compared to the current version in the code.
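The scale argument can be checked with a small sketch in plain Python. It is an illustration, not the released code: `mean_pixel_weight` and the two example batches are hypothetical, and it assumes every pixel belongs to exactly one class. Each class with n pixels contributes n * (BxHxW / n) = BxHxW to the sum of per-pixel weights, so the average weight per pixel equals the number of classes present unless the extra divisor is applied.

```python
def mean_pixel_weight(pixel_counts, divide_by_present_classes):
    """Average weight applied to a pixel, given per-class pixel counts."""
    total = sum(pixel_counts)                         # B*H*W
    present = sum(1 for n in pixel_counts if n > 0)   # classes with non-zero occurrence
    weighted_sum = 0.0
    for n in pixel_counts:
        if n == 0:
            continue                                  # absent classes contribute nothing
        w = total / n
        if divide_by_present_classes:
            w /= present
        weighted_sum += n * w                         # n pixels, each weighted by w
    return weighted_sum / total


# Two hypothetical batches with a different number of present classes.
batch_a = [60, 30, 10, 0]   # 3 classes present
batch_b = [50, 50, 0, 0]    # 2 classes present

scale_a_paper = mean_pixel_weight(batch_a, divide_by_present_classes=False)
scale_b_paper = mean_pixel_weight(batch_b, divide_by_present_classes=False)
scale_a_code = mean_pixel_weight(batch_a, divide_by_present_classes=True)
scale_b_code = mean_pixel_weight(batch_b, divide_by_present_classes=True)
```

Without the divisor, the average weight follows the number of present classes (about 3 for the first batch and 2 for the second). With it, both batches end up with an average weight of about 1. This only tracks the weights, so it says the loss scale is comparable across batches to the extent that the unweighted per-pixel losses are.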

So yes, in our experiments we used the version which is in the released code.

Vadim

SushkoVadim, Feb 16 '22 14:02