tf-eager-fasterrcnn

Training losses all nan values

Open: LMD93 opened this issue 4 years ago • 1 comment

Hello, I tried running the Jupyter notebook as-is to train the model. The only change I made was to the scale argument of train_dataset.

train_dataset = coco.CocoDataSet(
    "./COCO2017/",
    "val",
    flip_ratio=0.5,
    pad_mode="fixed",
    mean=img_mean,
    std=img_std,
    scale=(256, 512),
)

I printed out the individual losses and this is what I see.

rpn_class_loss  tf.Tensor(nan, shape=(), dtype=float32)
rpn_bbox_loss  tf.Tensor(nan, shape=(), dtype=float32)
rcnn_class_loss  tf.Tensor(0.0, shape=(), dtype=float32)
rcnn_bbox_loss  tf.Tensor(nan, shape=(), dtype=float32)

There is no error thrown, and I did not make any changes to any of the scripts. Any idea why this is happening? Thanks!

LMD93, Jul 09 '20 23:07

I am not sure what the problem is. I tried it and it ran successfully. Could you please send your program logs or screenshots?

Viredery, Jul 12 '20 17:07
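
One way to produce the logs requested above is to record the first training step at which any loss stops being finite. The following is only a minimal sketch, not part of the repository: the helper name `first_non_finite` is hypothetical, and the values are placeholders mirroring the printout in the report. It assumes each loss can be converted with `float()`, which holds for Python numbers and should also hold for scalar tensors in eager mode.

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss that is NaN or infinite, else None.

    `losses` maps a loss name to a scalar value. This is a hypothetical
    helper for illustration; it is not part of tf-eager-fasterrcnn.
    """
    for name, value in losses.items():
        # float() works on plain numbers; scalar eager tensors are assumed
        # to convert the same way.
        if not math.isfinite(float(value)):
            return name
    return None


# Placeholder values copied from the printout in the report above.
step_losses = {
    "rpn_class_loss": float("nan"),
    "rpn_bbox_loss": float("nan"),
    "rcnn_class_loss": 0.0,
    "rcnn_bbox_loss": float("nan"),
}

bad = first_non_finite(step_losses)
if bad is not None:
    print(f"non-finite loss detected: {bad}")
```

Calling such a check once per batch, and logging the batch index and image shape when it fires, would show whether the NaNs appear from the first step or only for particular inputs. TensorFlow's own `tf.debugging.check_numerics` can serve a similar purpose on individual tensors.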