Mosaic Augmentation for Object Detection

Open innat opened this issue 2 years ago • 14 comments

Mosaic augmentation is used for object detection in the YOLOv4 literature (FR). I'm not sure whether it's also used for classification tasks there, and I was wondering if it could be applied to image classification. There are two open issues:

  • References. Not sure if it's ever used for classification only in any literature.
  • Creation of class labels.

For creating class labels, one possible solution (described HERE) is to draw the mixing weights from a Dirichlet distribution.

>>> import numpy as np

# for 2 images. Equivalent to λ and (1-λ)
>>> np.random.dirichlet((1, 1), 1)
array([[0.92870347, 0.07129653]])

>>> np.random.dirichlet((1, 1, 1), 1)  # for 3 images.
array([[0.38712673, 0.46132787, 0.1515454 ]])

>>> np.random.dirichlet((1, 1, 1, 1), 1)  # for 4 images.
array([[0.59482542, 0.0185333 , 0.33322484, 0.05341645]])

Mosaic takes 4 images, so the last case applies. [image]
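To make the Dirichlet idea concrete, here is a minimal sketch of how the sampled weights could be turned into one soft label for four images. The helper name `dirichlet_mixed_label` and its `alpha` parameter are hypothetical, chosen only for illustration; this is not an existing KerasCV API, and it has not been validated for classification performance.

```python
import numpy as np


def dirichlet_mixed_label(one_hot_labels, alpha=1.0, rng=None):
    """Blend k one-hot labels with weights drawn from a symmetric Dirichlet.

    Hypothetical helper for illustration only (not part of KerasCV).
    `one_hot_labels` has shape (k, num_classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    one_hot_labels = np.asarray(one_hot_labels, dtype=np.float64)
    k = one_hot_labels.shape[0]
    # One weight per image; the weights are non-negative and sum to 1.
    weights = rng.dirichlet(np.full(k, alpha))
    # Weighted sum of the one-hot rows gives a single soft label.
    return weights @ one_hot_labels, weights


# Four images with class ids 0, 2, 2 and 4 out of 5 classes.
labels = np.eye(5)[[0, 2, 2, 4]]
mixed, weights = dirichlet_mixed_label(labels, rng=np.random.default_rng(0))
print(mixed, mixed.sum())
```

Note that weights sampled this way are independent of how much area each image actually occupies in the mosaic, which is one reason the approach would need validating.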


update:

Mosaic augmentation for classification has been used in the literature. Please check: https://github.com/keras-team/keras-cv/issues/250#issuecomment-1130256698

innat avatar Apr 01 '22 08:04 innat

This is definitely a useful layer.

kartik4949 avatar Apr 01 '22 18:04 kartik4949

If anyone is interested, you can follow up on the conversation here. https://github.com/AlexeyAB/darknet/issues/7088#issuecomment-757527542

innat avatar Apr 14 '22 08:04 innat

Would be happy to take this one.

artu1999 avatar May 17 '22 18:05 artu1999

@artu1999 Great. But note that you may need to validate the creation of mosaic labels for the classification task. I'm not sure if there are any references (papers). The above approach (Dirichlet) is just a pointer.

innat avatar May 17 '22 20:05 innat

@innat I can dig deeper into it, but it looks like YOLOv4 is the only one that references it for image classification. Maybe we can try asking them again to point us to the label creation they used for image classification?

Also, just to clarify: by validation, do you mean verifying that the probabilities you’d get from the Dirichlet distribution are proportional to the sizes of the “sub-images” in the mosaic?

artu1999 avatar May 18 '22 10:05 artu1999

> Also, just to clarify, with validation do you mean verifying that the probabilities you’d get from the dirichlet distribution are proportionate to the “sub-images” sizes in the mosaic?

I mean that whatever the label-creation approach is, it should be useful for the classification task (performance comparable to CutMix and MixUp).

innat avatar May 18 '22 16:05 innat

cc @AlexeyAB: mentioning Alex in case he can give some advice here. It would be really helpful.

innat avatar May 18 '22 16:05 innat

We set the ground truth probability in proportion to the area occupied by each sample: https://github.com/AlexeyAB/darknet/blob/4ee3be7e68fb9c7eda5cc390e47e59f01e40dded/src/data.c#L1913-L1942 So if the areas are Car=10%, Table=20%, Cat=30%, Dog=40%, then we set the labels Car=0.1, Table=0.2, Cat=0.3, Dog=0.4.
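As an illustration of the area-proportional rule, the sketch below computes the label weights for a mosaic that is split into four rectangles at a single point. The 2x2 layout, the function name `area_proportional_label`, and its parameters are assumptions made for this example; they have not been checked against the linked darknet code.

```python
import numpy as np


def area_proportional_label(one_hot_labels, center_x, center_y, width, height):
    """Soft label weighted by the area each image occupies in a 2x2 mosaic.

    Hypothetical helper: assumes the canvas is split at (center_x, center_y)
    and the images are ordered top-left, top-right, bottom-left, bottom-right.
    """
    one_hot_labels = np.asarray(one_hot_labels, dtype=np.float64)
    areas = np.array(
        [
            center_x * center_y,                        # top-left
            (width - center_x) * center_y,              # top-right
            center_x * (height - center_y),             # bottom-left
            (width - center_x) * (height - center_y),   # bottom-right
        ],
        dtype=np.float64,
    )
    weights = areas / (width * height)  # fractions of the canvas, sum to 1
    return weights @ one_hot_labels, weights


# Four different classes on a 100x100 canvas split at x=40, y=30.
labels = np.eye(4)
soft_label, weights = area_proportional_label(labels, 40, 30, 100, 100)
print(soft_label)
```

With one distinct class per quadrant, the soft label is simply the four area fractions; if two quadrants share a class, their fractions add up.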

We have not tried using it in any other way.

Mosaic for Classifier reduces Top5-error from 6.0% to 5.5% (-10 relative %) on Imagenet-1k, and it works better than CutMix/MixUp, Tables 2 and 3: https://arxiv.org/abs/2004.10934


We used Mosaic data augmentation:

  • for Classifier: YOLOv4
  • for Detector: YOLOv4, Scaled-YOLOv4, YOLOv5, YOLOR: https://github.com/WongKinYiu/yolor

We didn't use Mosaic for Classifier in (Scaled-YOLOv4, YOLOv5, YOLOR), just because they don't need Classification or any pre-trained weights. We train these Detectors from scratch to achieve the best accuracy/speed ratio.

It was originally introduced in the YOLOv4 paper: https://arxiv.org/abs/2004.10934

Some later work has used Mosaic:

  • https://arxiv.org/abs/2004.12432
  • https://arxiv.org/abs/2004.12178
  • https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Scaled-YOLOv4_Scaling_Cross_Stage_Partial_Network_CVPR_2021_paper.html
  • https://arxiv.org/abs/2105.04206 ...

AlexeyAB avatar May 18 '22 16:05 AlexeyAB

@innat @LukeWood I'm working on an implementation based on what Alex suggested. I managed to get the values for the labels right but I am having a few issues with creating the mosaic. Looking at CutMix as a reference, I am using fill_rectangles to draw the mosaic images over the input. This works fine when I fill only a single one of the sub-images, however when I try to do it for all three images I get a mixed up result (see the notebook for reference). Any idea why this happens?
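For reference, the intended result of filling the rectangles can be sketched in plain NumPy. This is not the `fill_rectangles` utility and not the notebook's code; `simple_mosaic` is a hypothetical name, and the sketch assumes four same-sized images with each quadrant copied from the same pixel positions of its source image (no resizing), as in CutMix-style patch pasting.

```python
import numpy as np


def simple_mosaic(images, center_x, center_y):
    """Build one mosaic from four same-sized images of shape (4, H, W, C).

    Hypothetical helper: each quadrant is copied from the same pixel
    positions of its source image, so no resizing takes place.
    """
    images = np.asarray(images)
    if images.shape[0] != 4:
        raise ValueError("simple_mosaic expects exactly 4 images")
    out = np.empty_like(images[0])
    out[:center_y, :center_x] = images[0, :center_y, :center_x]  # top-left
    out[:center_y, center_x:] = images[1, :center_y, center_x:]  # top-right
    out[center_y:, :center_x] = images[2, center_y:, :center_x]  # bottom-left
    out[center_y:, center_x:] = images[3, center_y:, center_x:]  # bottom-right
    return out


# Four constant-valued 8x10 RGB images make the quadrants easy to inspect.
images = np.stack([np.full((8, 10, 3), i, dtype=np.uint8) for i in range(4)])
mosaic = simple_mosaic(images, center_x=4, center_y=3)
print(mosaic[..., 0])
```

Because the four slices are disjoint and together cover the whole canvas, every output pixel is written exactly once, which makes a "mixed up" result easy to rule out when comparing against a batched implementation.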

artu1999 avatar May 23 '22 14:05 artu1999

Not sure, but I'm suspicious about how mosaic_x/y, s1-s4, center_x, and center_y are created. Can you check this and this implementation? They might give you some pointers.

innat avatar May 23 '22 19:05 innat

mentioning https://github.com/keras-team/keras-cv/issues/21 to de-duplicate them.

LukeWood avatar Jun 24 '22 04:06 LukeWood

> @innat @LukeWood I'm working on an implementation based on what Alex suggested. I managed to get the values for the labels right but I am having a few issues with creating the mosaic. Looking at CutMix as a reference, I am using fill_rectangles to draw the mosaic images over the input. This works fine when I fill only a single one of the sub-images, however when I try to do it for all three images I get a mixed up result (see the notebook for reference). Any idea why this happens?

hey @artu1999 any progress here? Sorry I missed your comment.

LukeWood avatar Jun 27 '22 22:06 LukeWood

> > @innat @LukeWood I'm working on an implementation based on what Alex suggested. I managed to get the values for the labels right but I am having a few issues with creating the mosaic. Looking at CutMix as a reference, I am using fill_rectangles to draw the mosaic images over the input. This works fine when I fill only a single one of the sub-images, however when I try to do it for all three images I get a mixed up result (see the notebook for reference). Any idea why this happens?
>
> hey @artu1999 any progress here? Sorry I missed your comment.

Hey @LukeWood, sorry for the inactivity recently. I’ve been travelling for the last few weeks and didn’t really have time to sit down properly and wrap it up. At the moment I’ve got the preprocessing function, but before pushing I would like to test it on a pre-trained model to see if it brings any benefits for classification. I am going back to London in two days and will push some updates in the next few days.

artu1999 avatar Jun 28 '22 16:06 artu1999

Don't sweat it @artu1999 ! Enjoy your trip

LukeWood avatar Jun 28 '22 17:06 LukeWood

@quantumalaviya @AdityaKane2001 any interest in working on this?

LukeWood avatar Aug 17 '22 22:08 LukeWood