Mask_RCNN

step - loss: nan - val_loss: nan in the training

Open gelsonwj opened this issue 1 year ago • 2 comments

Hello, I'm trying to train the Mask R-CNN network with multiple classes, but I only get `step - loss: nan - val_loss: nan` in every training epoch. I've already tried different values for several parameters, and I don't know what else to do. Does anyone have any ideas?

Configurations:

    BACKBONE                       resnet101
    BACKBONE_STRIDES               [4, 8, 16, 32, 64]
    BATCH_SIZE                     1
    BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
    COMPUTE_BACKBONE_SHAPE         None
    DETECTION_MAX_INSTANCES        100
    DETECTION_MIN_CONFIDENCE       0.7
    DETECTION_NMS_THRESHOLD        0.3
    FPN_CLASSIF_FC_LAYERS_SIZE     1024
    GPU_COUNT                      1
    GRADIENT_CLIP_NORM             1
    IMAGES_PER_GPU                 1
    IMAGE_CHANNEL_COUNT            3
    IMAGE_MAX_DIM                  640
    IMAGE_META_SIZE                18
    IMAGE_MIN_DIM                  640
    IMAGE_MIN_SCALE                0
    IMAGE_RESIZE_MODE              square
    IMAGE_SHAPE                    [640 640 3]
    LEARNING_MOMENTUM              0.9
    LEARNING_RATE                  1e-05
    LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
    MASK_POOL_SIZE                 14
    MASK_SHAPE                     [28, 28]
    MAX_GT_INSTANCES               50
    MEAN_PIXEL                     [123.7 116.8 103.9]
    MINI_MASK_SHAPE                (56, 56)
    NAME                           weedS_detection
    NUM_CLASSES                    6
    POOL_SIZE                      7
    POST_NMS_ROIS_INFERENCE        500
    POST_NMS_ROIS_TRAINING         1000
    PRE_NMS_LIMIT                  6000
    ROI_POSITIVE_RATIO             0.33
    RPN_ANCHOR_RATIOS              [0.5, 1, 2]
    RPN_ANCHOR_SCALES              (8, 16, 32, 64, 128)
    RPN_ANCHOR_STRIDE              [1]
    RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
    RPN_NMS_THRESHOLD              0.7
    RPN_TRAIN_ANCHORS_PER_IMAGE    128
    STEPS_PER_EPOCH                100
    TOP_DOWN_PYRAMID_SIZE          256
    TRAIN_BN                       False
    TRAIN_ROIS_PER_IMAGE           32
    USE_MINI_MASK                  False
    USE_RPN_ROIS                   True
    VALIDATION_STEPS               50
    WEIGHT_DECAY                   0.0001

boxes masks output

gelsonwj, Mar 11 '23 20:03

I got the same error while training the model on a GPU. What's the solution?

ridhoaanhrp, Jun 04 '23 13:06

Check your annotations for undefined values and for images that are not annotated at all. This error happens when annotations are missing.

dayana123456789, Sep 13 '23 08:09
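
The annotation check suggested above could be sketched roughly as follows. This is only an illustration, not code from the repository: `check_annotations` is a hypothetical helper, and it assumes each image's annotations are available as a mask array of shape `[H, W, N]` plus an array of `N` integer class IDs, with `0` reserved for the background class.

```python
import numpy as np


def check_annotations(masks, class_ids, num_classes):
    """Return a list of problems found in one image's annotations.

    Hypothetical helper, assuming:
      masks:       array of shape [H, W, N], one binary mask per instance
      class_ids:   array of shape [N], integer class IDs (0 = background)
      num_classes: total number of classes, including background
    """
    masks = np.asarray(masks)
    class_ids = np.asarray(class_ids)

    # An image with no instances at all is the "not annotated" case.
    if masks.ndim != 3 or masks.shape[-1] == 0:
        return ["image has no annotated instances"]

    problems = []
    if masks.shape[-1] != class_ids.shape[0]:
        problems.append("number of masks and number of class IDs differ")

    # Undefined (NaN or infinite) values in masks or class IDs.
    if not np.isfinite(masks.astype(np.float64)).all():
        problems.append("masks contain undefined values")
    finite_ids = np.isfinite(class_ids.astype(np.float64))
    if not finite_ids.all():
        problems.append("class IDs contain undefined values")

    # Class IDs must be foreground classes: 1 .. num_classes - 1.
    out_of_range = finite_ids & ((class_ids < 1) | (class_ids >= num_classes))
    for i in np.flatnonzero(out_of_range):
        problems.append(f"instance {i} has class ID outside 1..{num_classes - 1}")

    # A mask with no foreground pixels carries no usable annotation.
    for i in range(masks.shape[-1]):
        if not masks[..., i].any():
            problems.append(f"instance {i} has an empty mask")

    return problems
```

Running this over every training and validation image before calling `train`, and printing the image IDs that return a non-empty list, should narrow down which annotations to fix. With `NUM_CLASSES 6` as in the configuration above, valid foreground class IDs would be 1 through 5.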