
Handle zero pad & Allow bg to train RPN

Open keineahnung2345 opened this issue 5 years ago • 24 comments

Handle zero pad:

  • fpn_classifier_graph: use target_class_ids to identify the zero-padded samples, then zero out their results
  • mrcnn_class_loss_graph: use target_class_ids to identify the zero-padded samples, then remove them from the loss
  • add comments to detection_targets_graph, fpn_classifier_graph
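The padding filter can be sketched in NumPy (an illustrative helper with a made-up name, not the PR's TensorFlow code, which uses ops like tf.greater(target_class_ids, 0)):

```python
import numpy as np

def drop_zero_padded(target_class_ids, per_roi_loss):
    """Hypothetical helper mirroring the PR's filter: entries whose
    target class id is 0 are treated as zero padding and removed
    from the loss (note that background ROIs also carry id 0)."""
    keep = target_class_ids > 0
    return per_roi_loss[keep]

target_class_ids = np.array([3, 1, 0, 0])      # last two are padding
per_roi_loss = np.array([0.2, 0.5, 0.9, 0.7])
print(drop_zero_padded(target_class_ids, per_roi_loss))  # → [0.2 0.5]
```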

Allow background images to train RPN

  • build_rpn_targets: generate negative anchors for samples that have no instances
  • generate_random_rois: generate random ROIs for samples that have no instances
  • data_generator: allow images without any instances to train the RPN (these images contribute to rpn_class_loss only)
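A rough NumPy sketch of what the RPN targets become for an instance-free image (a hypothetical helper, not the PR's code; the real build_rpn_targets also produces random bbox targets via generate_random_rois):

```python
import numpy as np

def rpn_targets_for_empty_image(num_anchors, bbox_slots=256):
    # With no GT instances, every anchor is marked negative (-1), so the
    # image trains only the RPN objectness head (rpn_class_loss); the
    # bbox target array stays zero because no deltas are regressed.
    rpn_match = -1 * np.ones(num_anchors, dtype=np.int32)
    rpn_bbox = np.zeros((bbox_slots, 4), dtype=np.float32)
    return rpn_match, rpn_bbox

match, bbox = rpn_targets_for_empty_image(num_anchors=5)
print(match)  # → [-1 -1 -1 -1 -1]
```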

keineahnung2345 avatar Oct 29 '18 14:10 keineahnung2345

Wow, I totally missed that Mask_RCNN skips images without instances, even though I saw the balloon example where they delete those samples. For some reason I thought that was specific to that example.

antimirov avatar Nov 25 '18 22:11 antimirov

Yes, I hope this PR could be reviewed ASAP. @waleedka

keineahnung2345 avatar Nov 27 '18 01:11 keineahnung2345

With your code, is it enough to include photos with an empty polygons list and an empty-string class name in balloon.py (or equivalent) so that the photo is taken into account in the loss, or are other modifications needed to use background pictures? My model struggles with false positives, and I'm convinced it could do much better if I could include pure background pictures. I hope that's clear enough.

pmaksud avatar Nov 28 '18 18:11 pmaksud

@pmaksud It should work fine with background image input: BG images contribute only to rpn_class_loss, nothing else.

keineahnung2345 avatar Nov 29 '18 02:11 keineahnung2345

@keineahnung2345 Hi, thank you for the pull request. I am working on remote sensing images, where including background in training is important for the final results. I merged this pull request into my fork, and the training loss looked good. However, when I ran inference with the trained model, it predicted 100 instances on every image. Before merging this pull request, each image only had a few or zero instances, which made sense. I think the number 100 comes from "max number of final detections" (set to 100) in config.py. A model trained before merging this pull request still outputs reasonable results, so the inference code is fine. Could you check the code? Thank you!

Best, Lingcao

yghlc avatar Dec 01 '18 04:12 yghlc

In my case, I included some background images in both my train and test sets and tried to overfit them, but I still get false positives, which is very strange. Do you think I should increase the weight of the RPN class loss, or something like that?

pmaksud avatar Dec 02 '18 09:12 pmaksud

Since Mask R-CNN is composed of many different parts, it's hard to say which part of the model goes wrong. There is another issue discussing the interaction of the RPN, ProposalLayer, and DetectionTargetLayer: https://github.com/matterport/Mask_RCNN/issues/726, but we need more evidence. You can use the revised run_graph (https://github.com/matterport/Mask_RCNN/pull/1065) to check the output of each stage in training mode, making more detailed analysis possible.

keineahnung2345 avatar Dec 03 '18 00:12 keineahnung2345

Is there any reason this hasn't been merged into the main branch? Seems like a useful addition, I'm assuming many domains would benefit from additional negative training samples.

After running this, though, I checked the size of my dataset with the inspect_balloon_data.ipynb notebook, and it still doesn't seem to include the additional files. @keineahnung2345, is there another way to check that they're all included? Shouldn't the full BG-included dataset show up when you call dataset.prepare() and then look at dataset.num_images?

Thanks for adding this!

patrickcgray avatar Jan 14 '19 16:01 patrickcgray

I've tried this hoping it could help the model learn from images without any masks (i.e., background-only images). Yet I had similar issues to others in this thread: the new model trained with background images actually gave more false positives than before, which is exactly the opposite of what I want. The point of using background images is to reduce FPs (maybe at a small cost in FNs). This code doesn't seem to work well for that purpose.

cfengai avatar Mar 22 '19 17:03 cfengai

Same for me.

Seanspt avatar Jul 23 '19 02:07 Seanspt

I have had problems implementing these changes and have not been able to train well. Can anyone confirm they achieved good results? Is there an option to pass images with empty masks to simulate this behavior? Thanks!

adriaciurana avatar Sep 23 '19 20:09 adriaciurana

@adriaciurana did you ever find out? :) I am also struggling with false positives, and I think adding BG images could help as well, since the model performs really well on positive samples.

MathiasKahlen avatar Oct 08 '19 08:10 MathiasKahlen

Sorry for the delay; I applied some changes to the code in order to train the RPN with background images.

In the build_rpn_targets function (line 1457) of model.py, add:

```python
if gt_class_ids.shape[0] == 0:
    rpn_match = -1 * np.ones([anchors.shape[0]], dtype=np.int32)
    rpn_bbox = generate_random_rois(image_shape,
                                    config.RPN_TRAIN_ANCHORS_PER_IMAGE,
                                    gt_class_ids, gt_boxes)
    return rpn_match, rpn_bbox
```

In the generate_random_rois function (line 1571) of model.py, change:

```python
rois_per_box = int(0.9 * count / gt_boxes.shape[0])
```

to:

```python
rois_per_box = int(0.9 * count / (gt_boxes.shape[0] + 0.000001))
```

In the data_generator function (line 1718) of model.py, remove:

```python
if not np.any(gt_class_ids > 0):
    continue
```

I have not been able to run all the appropriate tests, but we have not observed problems in training.

On the other hand, we want to do the training in two phases (we have not yet been able to try this):

  • Phase 1: the whole RPN + FPN together.
  • Phase 2: only the FPN, with USE_RPN_ROIS = False (config.py) and random_rois = True, detection_targets = True (model.py). That way the FPN receives negatives and learns to classify better. False positives come from RPN errors and from the FPN not learning negatives well.
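As a sketch, the phase-2 switches described above might be collected like this (names follow the comment; note that in the upstream data_generator, random_rois is actually an integer count rather than a boolean):

```python
# Phase 2: bypass the RPN proposals and feed the heads random ROIs.
phase2_settings = {
    "USE_RPN_ROIS": False,      # config.py: heads use externally provided ROIs
    "random_rois": True,        # model.py: data generator emits random ROIs
    "detection_targets": True,  # model.py: build head targets from those ROIs
}
print(phase2_settings["USE_RPN_ROIS"])  # → False
```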

adriaciurana avatar Oct 14 '19 08:10 adriaciurana

Thanks a lot, I'll have to try that out when I get the time for it :) I will get back with updates on how it goes.

MathiasKahlen avatar Oct 18 '19 06:10 MathiasKahlen

Thanks!

luxedo avatar Nov 08 '19 17:11 luxedo

@keineahnung2345

Thanks for your comment. Would you please share a code segment showing how to use run_graph? I want to check the output of each stage during training.

AI-ML-Enthusiast avatar Jan 06 '20 10:01 AI-ML-Enthusiast

It works fine. Thank you!!!

konstantin-frolov avatar Jul 01 '20 14:07 konstantin-frolov

I tried to use this PR for a single-class instance segmentation task, but the trained model produced more false positives.

I looked at mrcnn_class_loss_graph and noticed that ROIs are removed from the loss calculation if the corresponding target_class_ids is 0. I know that removes zero padding, but isn't the BG class id 0 too?

I'm not sure if this is applicable to other tasks, but I was able to reduce false positives by changing the line

```python
non_zeros = tf.cast(tf.greater(target_class_ids, 0), 'float32')
```

to

```python
non_zeros = tf.cast(tf.math.not_equal(tf.reduce_sum(pred_class_logits, axis=-1), 0), 'float32')
```
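To illustrate the difference between the two masks, here is a toy NumPy version (illustrative values only; the real code operates on TF tensors, and the logits-sum test assumes a genuine ROI's logits never sum exactly to zero):

```python
import numpy as np

pred_class_logits = np.array([
    [2.0, -1.0],   # real ROI with a prediction
    [0.5,  0.3],   # real background ROI (target id 0)
    [0.0,  0.0],   # zero padding: no prediction at all
])
target_class_ids = np.array([1, 0, 0])

mask_old = target_class_ids > 0                    # drops genuine BG ROIs too
mask_new = pred_class_logits.sum(axis=-1) != 0     # drops only the padding
print(mask_old)  # → [ True False False]
print(mask_new)  # → [ True  True False]
```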

Spritaro avatar Aug 19 '20 13:08 Spritaro

```python
non_zeros = tf.cast(tf.greater(target_class_ids, 0), 'float32')
```

Hi, I can't find that line in model.py. Where did you find it?

konstantin-frolov avatar Aug 19 '20 13:08 konstantin-frolov

@konstantin-frolov It's from the files changed in this pull request. You can also find it here: https://github.com/keineahnung2345/Mask_RCNN/blob/168ca5cdb7f5656722ec6ed2dd67451cde994421/mrcnn/model.py#L1118

Spritaro avatar Aug 19 '20 14:08 Spritaro

It's from files changed in this pull request. You can also find it here

Thanks. I used the approach from @adriaciurana for multi-class detection. But when trying to detect a single class (BG and one object), the BG images had no effect on false positives.

konstantin-frolov avatar Aug 19 '20 15:08 konstantin-frolov

@Spritaro that is precisely the problem with this PR IMO. By filtering zero-valued target_class_ids out of the mrcnn_class_loss_graph, the model cannot learn to predict BG for bad proposals anymore – which becomes critical after allowing empty images in the training loop, or if you have images with few/sparse instances.

Your fix works because the class logits for true (BG) zeros are never zero everywhere. I have dealt with this myself by looking at the (indices of) all-zero bbox predictions. (Also, the same thing needs to be done for mrcnn_bbox_loss_graph.)

But in my experience there's one further ingredient required to learn BG well and avoid FPs: in detection_targets_graph, positive targets are filled up with negative targets according to ROI_POSITIVE_RATIO. If there are too few positives (again, from no or too few GT instances), that ratio won't hold and there will be too few or no negatives; the targets will simply be padded with zeros. The only way to fix this is to ensure a constant number of negatives.
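The constant-negatives idea can be sketched like this (illustrative NumPy only; sample_rois and its parameters are made-up names, not the linked implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_rois(pos_idx, neg_idx, train_rois=6, positive_ratio=1/3):
    # Cap positives by the ratio, then always fill the remaining slots
    # with negatives, sampling with replacement if there aren't enough,
    # instead of zero-padding the targets when positives are scarce.
    num_pos = min(len(pos_idx), int(train_rois * positive_ratio))
    num_neg = train_rois - num_pos
    pos = rng.choice(pos_idx, num_pos, replace=False)
    neg = rng.choice(neg_idx, num_neg, replace=len(neg_idx) < num_neg)
    return pos, neg

# One positive, three negatives available: still yields 6 training ROIs.
pos, neg = sample_rois(np.array([7]), np.array([1, 2, 3]))
print(len(pos), len(neg))  # → 1 5
```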

You can find an implementation of that here.

The recipe proposed by @adriaciurana of running with random_rois = True at a later stage to make the FPN robust to bad proposals is still valid (if not as pressing anymore) though, especially if the RPN gets "too good" early on.

bertsky avatar Sep 19 '20 12:09 bertsky

Hi everybody, first of all thanks for all the insightful information and implementations! I am currently training a model to detect objects of a single class (if present in the image). I ran into the same issue: the model performs quite well when trained on images with at least one instance of the object, but predicts a rather large number of false positives, i.e., detects objects where there are none. I tried the suggestions in this PR; unfortunately, after training, my model now predicts even more false positives, and all predictions have a detection confidence of 1.0. Did anybody else encounter this behavior?

philippmaa avatar Aug 02 '21 14:08 philippmaa

@bertsky Hi, thank you very much for this. When training with your code, do I need to put background images or images with undesired objects in my train and val datasets? I am training with only one class: concrete crack. My trained model apparently detects other objects such as humans, hands, even straight lines or shadows, as "crack", so many false positives.

Also, do I need to change anything else besides model.py? I'm really new to CNNs and such.

MatchaCookies avatar Jun 23 '22 07:06 MatchaCookies