
In my task, it always returns nan. Maybe it is not as widely used as BCE?

Open MangoFF opened this issue 2 years ago • 3 comments

Why does it always return nan? Here is my log:

(l1_loss): L1Loss()
(new_loss): AsymmetricLossOptimized()
(bcewithlog_loss): AsymmetricLossOptimized()
(iou_loss): IOUloss()
)
)
2022-04-20 13:03:33 | INFO | yolox.core.trainer:202 - ---> start train epoch1
2022-04-20 13:03:39 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 10/646, mem: 5053Mb, iter_time: 0.555s, data_time: 0.001s, total_loss: nan, iou_loss: 2.4, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 1.498e-10, size: 480, ETA: 4:28:59
2022-04-20 13:03:45 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 20/646, mem: 5571Mb, iter_time: 0.572s, data_time: 0.001s, total_loss: nan, iou_loss: 3.1, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 5.991e-10, size: 640, ETA: 4:33:02
2022-04-20 13:03:48 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 30/646, mem: 5571Mb, iter_time: 0.324s, data_time: 0.001s, total_loss: nan, iou_loss: 3.1, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 1.348e-09, size: 384, ETA: 3:54:11
2022-04-20 13:03:52 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 40/646, mem: 5571Mb, iter_time: 0.380s, data_time: 0.000s, total_loss: nan, iou_loss: 2.3, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 2.396e-09, size: 448, ETA: 3:41:30
2022-04-20 13:03:56 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 50/646, mem: 5571Mb, iter_time: 0.442s, data_time: 0.000s, total_loss: nan, iou_loss: 2.3, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 3.744e-09, size: 512, ETA: 3:39:53
2022-04-20 13:03:59 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 60/646, mem: 5571Mb, iter_time: 0.283s, data_time: 0.001s, total_loss: nan, iou_loss: 2.7, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 5.392e-09, size: 320, ETA: 3:25:58
2022-04-20 13:04:02 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 70/646, mem: 5571Mb, iter_time: 0.275s, data_time: 0.001s, total_loss: nan, iou_loss: 2.7, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 7.339e-09, size: 448, ETA: 3:15:28
2022-04-20 13:04:05 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 80/646, mem: 5571Mb, iter_time: 0.293s, data_time: 0.001s, total_loss: nan, iou_loss: 2.4, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 9.585e-09, size: 512, ETA: 3:08:40
2022-04-20 13:04:07 | INFO | yolox.core.trainer:260 - epoch: 1/45, iter: 90/646, mem: 5571Mb, iter_time: 0.228s, data_time: 0.001s, total_loss: nan, iou_loss: 2.5, l1_loss: 0.0, conf_loss: nan, cls_loss: 0.0, lr: 1.213e-08, size: 384, ETA: 2:59:52
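One thing worth checking (a guess, not specific to this repo's exact code): if a predicted probability saturates to exactly 0, its log term is -inf, and 0 * (-inf) is NaN in floating point, so a single saturated entry poisons the whole summed loss. This is easier to hit with mixed-precision training. A tiny PyTorch illustration with made-up values:

import torch

# A probability that saturated to exactly 0 alongside its zero target weight.
xs_pos = torch.tensor([1.0, 0.0])
targets = torch.tensor([1.0, 0.0])

bad = targets * torch.log(xs_pos)                     # [0., nan] because 0 * (-inf) = nan
good = targets * torch.log(xs_pos.clamp(min=1e-8))    # [0., 0.]  after clamping before log

print(bad, bad.sum())    # the NaN survives the reduction
print(good, good.sum())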

MangoFF avatar Apr 20 '22 05:04 MangoFF

@MangoFF, I am also having trouble applying the loss to another task. The loss values are very high, around the 3k-5k range, and the loss doesn't decrease much from there. I'm not sure whether the loss is sensitive to the default hyperparameter values or whether something else is wrong.
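One possible (hedged) explanation for the magnitude: if the reduction is a sum over every sample and class rather than a mean, values in the thousands are expected and don't by themselves indicate divergence. A quick illustration with made-up numbers:

import torch

batch_size, num_classes = 64, 80
per_element = torch.full((batch_size, num_classes), 0.7)  # typical per-term BCE-scale values

summed = per_element.sum()                    # 3584.0: looks alarming, but it's just the reduction
averaged = per_element.mean()                 # 0.7: same signal, normalized
per_sample = per_element.sum() / batch_size   # 56.0: sum over classes, mean over the batch

print(summed.item(), averaged.item(), per_sample.item())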

dineshdaultani avatar Jul 08 '22 03:07 dineshdaultani

I also get NaNs when using ASL on a custom multi-label classification task. Everything seemed to work fine when I tested with gamma_neg=0, gamma_pos=0 and gamma_neg=2, gamma_pos=2. However, it seems that I get NaNs as soon as I choose gamma_neg to be larger than gamma_pos. Maybe an issue with numeric stability?
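If it is a numerical-stability problem, one generic fix is to clamp the probabilities before taking the log and to avoid in-place updates around the focusing term. Below is a minimal sketch of an asymmetric-focal-style loss with explicit clamping; the class name and defaults are my own for illustration, and it is not the repository's AsymmetricLossOptimized:

import torch
import torch.nn as nn

class StableAsymmetricLoss(nn.Module):
    # Hypothetical sketch: asymmetric focusing + probability shifting with eps clamping.
    def __init__(self, gamma_neg=4, gamma_pos=1, clip=0.05, eps=1e-8):
        super().__init__()
        self.gamma_neg, self.gamma_pos = gamma_neg, gamma_pos
        self.clip, self.eps = clip, eps

    def forward(self, logits, targets):
        # logits: raw scores (N, C); targets: multi-hot labels (N, C)
        xs_pos = torch.sigmoid(logits)
        xs_neg = 1.0 - xs_pos

        # Asymmetric clipping (probability shifting) for the negatives.
        if self.clip > 0:
            xs_neg = (xs_neg + self.clip).clamp(max=1)

        # Clamping before log prevents log(0) -> -inf -> NaN.
        los_pos = targets * torch.log(xs_pos.clamp(min=self.eps))
        los_neg = (1 - targets) * torch.log(xs_neg.clamp(min=self.eps))

        # Asymmetric focusing with per-element gamma (gamma_neg for negatives).
        pt = xs_pos * targets + xs_neg * (1 - targets)
        gamma = self.gamma_pos * targets + self.gamma_neg * (1 - targets)
        focal_weight = torch.pow(1 - pt, gamma)

        loss = (los_pos + los_neg) * focal_weight  # out-of-place, autograd-friendly
        return -loss.sum()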

IsabelFunke avatar Oct 28 '22 16:10 IsabelFunke

I also get NaNs when using ASL on a custom multi-label classification task. Everything seemed to work fine when I tested with gamma_neg=0, gamma_pos=0 and gamma_neg=2, gamma_pos=2. However, it seems that I get NaNs as soon as I choose gamma_neg to be larger than gamma_pos. Maybe an issue with numeric stability?

Adam or AdamW is recommended as the optimizer; SGD is not recommended.
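For reference, a minimal sketch of swapping SGD for AdamW in PyTorch (the model head, learning rate, and weight decay below are placeholder values, not recommendations from this repo):

import torch

model = torch.nn.Linear(2048, 80)  # placeholder multi-label classification head

# optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)  # not recommended here
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)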

sorrowyn avatar Nov 24 '22 02:11 sorrowyn