
How is augmentation done in this implementation of Mask R-CNN?

hrithikpuri5 opened this issue 5 years ago · 7 comments

When I apply the following data augmentation technique

iaa.Sequential([iaa.Fliplr(0.5)])

will the augmentation be applied to every training image, with both the non-augmented and augmented versions used as training samples, or will only the augmented image be used?

hrithikpuri5 · Dec 26 '19

@hrithikpuri5 From what I understand of model.py, it applies the augmentation pipeline to every image. In your case the flip probability is 0.5, so in any given epoch the chance of the flip being applied to a particular image is 50%. Running multiple epochs therefore makes it likely that both the original and the flipped version will be used, but it also means the network may see the same image multiple times.
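
As a rough, standalone check of that behaviour (this snippet is not part of the repo, it just exercises imgaug directly), you can run the same Fliplr(0.5) pipeline over one image many times and count how often it actually comes back flipped:

```python
import numpy as np
import imgaug.augmenters as iaa

# Tiny image with an asymmetric pattern so a horizontal flip is detectable.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, 0, :] = 255  # bright left-most column

aug = iaa.Sequential([iaa.Fliplr(0.5)])

trials, flips = 1000, 0
for _ in range(trials):
    out = aug.augment_image(image)
    if out[:, -1, :].any():  # bright column ended up on the right => flipped
        flips += 1

print(f"flipped {flips}/{trials} times (expect roughly 50%)")
```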

In another post (https://github.com/matterport/Mask_RCNN/issues/1924) I posted a suggestion where multiple augmenters are used with different values / probabilities. This ensures that even if the network comes across the same image again, there is a high chance that some augmentation will be applied. Am I concerned about the original image? No. Any image produced by augmentation could just as well have been an original!
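
For illustration, a pipeline along those lines (not the exact one from #1924, just a sketch of the idea; the probabilities and value ranges here are made up) could look like this:

```python
import imgaug.augmenters as iaa

# Several augmenters, each with its own probability / value range, so that
# even if the generator revisits an image, it is unlikely to appear in
# exactly the same form twice.
augmentation = iaa.Sequential([
    iaa.Fliplr(0.5),
    iaa.Flipud(0.2),
    iaa.Sometimes(0.5, iaa.Affine(rotate=(-15, 15))),
    iaa.Sometimes(0.3, iaa.GaussianBlur(sigma=(0.0, 1.0))),
    iaa.Sometimes(0.3, iaa.Multiply((0.8, 1.2))),  # mild brightness jitter
])
```

Note that, as far as I can see, model.py applies only a whitelist of shape-changing augmenters (Fliplr, Flipud, Affine, etc.) to the masks, so photometric augmenters like GaussianBlur and Multiply affect the image only.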

vijaygill · Jan 09 '20

So you are saying that there might be a case where the model never sees the original image, only an augmented version?

hrithikpuri5 · Jan 09 '20

> So you are saying that there might be a case where the model never sees the original image, only an augmented version?

Yes, due to the probabilities involved.

vijaygill · Jan 09 '20

> So you are saying that there might be a case where the model never sees the original image, only an augmented version?

It seems you really want the model to see your original images; in that case, why not call model.train twice in your training script, once with no augmentation and a second time with augmentation?
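
A rough sketch of that (assuming model, config, dataset_train and dataset_val are already set up as in the repo's sample training scripts; the epoch counts are arbitrary):

```python
import imgaug.augmenters as iaa

augmentation = iaa.Sequential([iaa.Fliplr(0.5)])

# Phase 1: no augmentation, so every original image is seen as-is.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=20,
            layers='heads')

# Phase 2: continue training with augmentation enabled.
# Note: `epochs` is the total epoch count, so this runs 20 more epochs.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers='all',
            augmentation=augmentation)
```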

But I still feel that worrying about original vs. augmented images is not really necessary, because training is about extracting features, and from the machine's (or machine learning's) point of view, any augmented version could be the original!

For example, look at https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&r=false&c=%2Fm%2F06pcq. I have trained a model (for some classes only, though) using this dataset plus its segmentation information. If any of those pictures is, say, rotated by 30 degrees, does it lose its "originality" from a training point of view?

vijaygill · Jan 11 '20

Dear guys,

I have a question regarding the augmentation. When, for example, an image is flipped, will the ground-truth polygon for that particular instance in the image also be flipped? I think it has to be, because if the image is flipped, the instance ends up in a different position, right?

harezhh · Dec 17 '21

> When, for example, an image is flipped, will the ground-truth polygon for that particular instance in the image also be flipped?

Yes, it flips the mask as well; see this line in the code: https://github.com/matterport/Mask_RCNN/blob/3deaec5d902d16e1daf56b62d5971d428dc920bc/mrcnn/model.py#L1228
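
The mechanism, roughly, is that load_image_gt calls to_deterministic() on the augmentation pipeline so that the image and its masks receive exactly the same random decisions (the repo additionally restricts mask augmentation to shape-changing augmenters via a hook, which is omitted here). A simplified sketch:

```python
import numpy as np
import imgaug.augmenters as iaa

image = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
mask = np.zeros((128, 128, 1), dtype=np.uint8)
mask[32:64, 32:64, 0] = 1  # instance in the upper-left quadrant

aug = iaa.Sequential([iaa.Fliplr(0.5)])

# Freeze the random choices so both arrays receive the same transform.
det = aug.to_deterministic()
image_aug = det.augment_image(image)
mask_aug = det.augment_image(mask)

# If the image was flipped, the mask is flipped with it, so the ground
# truth stays aligned with the instance's new position.
assert image_aug.shape == image.shape and mask_aug.shape == mask.shape
```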

vijaygill · Dec 17 '21

Hi guys, I'm also curious why we set the flip probability to, say, 0.5. Does that give better performance than flipping 100% of the time?

sean880304 · Jul 13 '22