Pytorch_Retinaface

loc loss nan

Open eeewhe opened this issue 5 years ago • 4 comments

When I train mobile0.25 on WIDER FACE, the loc loss is sometimes NaN.

eeewhe avatar Dec 18 '19 10:12 eeewhe

https://github.com/biubug6/Pytorch_Retinaface/issues/10#issuecomment-567029509

eeewhe avatar Dec 18 '19 13:12 eeewhe

When I get this error, I print the values of loss_t, loc_t and loc_p. I found that min(loc_t) in smooth_l1_loss is -inf, which results in inf in loss_t, but I don't know how the -inf is produced in loc_t.

dianxin556 avatar Jan 06 '20 09:01 dianxin556

I just discard the batch like this..., and have not analyzed it in depth.

eeewhe avatar Jan 06 '20 14:01 eeewhe
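
The code for the batch-discarding workaround mentioned above is not shown in the thread. As a rough illustration only, here is a minimal sketch of the general idea: check whether the loss values are finite before using them, and skip the batch otherwise. The function name `should_skip_batch` and the sample loss values are hypothetical, and the sketch assumes each loss has already been converted to a plain Python float.

```python
import math


def should_skip_batch(loss_values):
    """Return True if any loss component is NaN or +/-inf.

    Hypothetical helper: assumes each loss is already a Python float.
    """
    return any(not math.isfinite(v) for v in loss_values)


# Made-up (loc_loss, cls_loss) pairs standing in for per-batch losses.
batches = [(0.8, 1.2), (float("inf"), 1.1), (float("nan"), 0.9), (0.5, 0.7)]

# Keep only the batches whose losses are all finite.
kept = [b for b in batches if not should_skip_batch(b)]
print(kept)
```

Skipping a batch hides the symptom but not the cause, so it is probably best treated as a stopgap while the source of the -inf is tracked down.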

The error may be caused by data_augment.py and box_utils.py. data_augment.py may produce a gt_box with zero area, which then gets passed to log() or another function. The reason is that multiple gt_boxes can match the same anchor: after the function "match" in box_utils.py (especially around line 143 of "data_augment.py"), an anchor may end up matched to a new, wrong gt_box while its class is not set to zero because of the original overlap.

ljdongysu avatar Jul 22 '21 03:07 ljdongysu
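
If zero-area ground-truth boxes are indeed the trigger, one possible mitigation is to drop them before they reach the matching and encoding step. The sketch below is not code from the repository. It assumes boxes in `[x1, y1, x2, y2]` form and an SSD-style encoding that takes the log of the width/height ratio, and the names `box_is_valid`, `filter_boxes` and `min_size` are hypothetical.

```python
import math


def box_is_valid(box, min_size=1e-6):
    """True if the box has strictly positive width and height."""
    x1, y1, x2, y2 = box
    return (x2 - x1) > min_size and (y2 - y1) > min_size


def filter_boxes(boxes, labels, min_size=1e-6):
    """Drop degenerate boxes together with their labels."""
    kept = [(b, l) for b, l in zip(boxes, labels) if box_is_valid(b, min_size)]
    return [b for b, _ in kept], [l for _, l in kept]


# The second box has zero width (x1 == x2), so a log-ratio encoding would
# evaluate log(0). math.log raises ValueError for 0, whereas a tensor log
# would return -inf and propagate into the loss.
boxes = [[0.1, 0.1, 0.5, 0.5], [0.3, 0.3, 0.3, 0.6]]
labels = [1, 1]

clean_boxes, clean_labels = filter_boxes(boxes, labels)
print(clean_boxes, clean_labels)

# Width term for the surviving box against an assumed prior width of 0.2.
width = clean_boxes[0][2] - clean_boxes[0][0]
print(math.log(width / 0.2))
```

Filtering at the augmentation stage means the loss code stays unchanged; whether this fully removes the NaN would depend on whether zero-area boxes are the only source of the -inf.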