keras-cv
Adds support for keypoints in augmentation
What does this PR do?
Fixes #518. Also partially addresses the remarks in #533 that shear does not support bounding boxes.
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
- [x] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up to date status, view the checks section at the bottom of the pull request.
@atuleu It's a great addition. If possible, let's add a code example for keypoint augmentation.
@innat
Here is how you could use this:
from keras_cv import layers
import numpy as np
import tensorflow as tf

layer = layers.RandomRotation(factor=0.5)
augmented = layer({
    'images': np.random.random((512, 512, 3)).astype("float32"),
    'labels': [1, 1, 2],  # each group of keypoints has a label
    # keypoints are a list of 2D points for each label; each label can have
    # a different number of keypoints, hence the ragged tensor
    'keypoints': tf.ragged.constant(
        [[[[1, 2], [3, 4]], [[5, 6]], [[7, 8], [9, 10]]],
         [[[11, 12], [13, 14]], []]]),
})
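As a side note on coordinate conventions: the 'rel_xy' format mentioned later in this thread normalizes pixel coordinates by the image size. A minimal numpy sketch of that conversion (the function names `xy_to_rel_xy` and `rel_xy_to_xy` are hypothetical illustrations, not the keras_cv API):

```python
import numpy as np

def xy_to_rel_xy(keypoints, image_shape):
    """Normalize pixel-space (x, y) keypoints to [0, 1] relative coordinates."""
    height, width = image_shape[:2]
    scale = np.array([width, height], dtype="float32")
    return np.asarray(keypoints, dtype="float32") / scale

def rel_xy_to_xy(keypoints, image_shape):
    """Inverse conversion: relative coordinates back to pixel space."""
    height, width = image_shape[:2]
    scale = np.array([width, height], dtype="float32")
    return np.asarray(keypoints, dtype="float32") * scale

points = np.array([[256.0, 128.0], [512.0, 512.0]])
rel = xy_to_rel_xy(points, (512, 512, 3))
print(rel.tolist())  # [[0.5, 0.25], [1.0, 1.0]]
```

Relative coordinates have the advantage of staying valid when the image is resized.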
@atuleu yes, I'd like to add a code example here https://github.com/keras-team/keras-cv/tree/master/examples/layers/preprocessing for reference.
Hey @atuleu! Thanks for your enthusiasm and contributions!
Keypoints is definitely in our roadmap, but I want to make sure we get the API right. I'll be sure to spend some time to look over this label type and how other frameworks handle it so we can compare the different approaches.
Are keypoints basically always xy pairs? Are they ever normalized to the image size? These are the questions we need to be sure to answer.
So I think this (unfortunately large) work is a bit more ready for review, as discussed in #518:
- It adds `keras_cv.keypoint` format descriptions and conversion functions. The currently supported formats are 'xy' and 'rel_xy'.
- It makes the original Keras augmentation layers inherit from `keras_cv.layers.preprocessing.BaseImageAugmentationLayer` and support `augment_keypoint()` and `augment_bounding_boxes()` (most of the time by performing a no-op).
- It changes the `with_labels` unit test into a `with_keys` test, and checks that all listed layers support bounding box and keypoint augmentation in both batched and unbatched formats.
- It adds support for keypoints and bounding boxes in `RandomRotation`, `RandomTranslation`, and `RandomShear`. The bounding box augmentation is inferred from the corner augmentation with the help of `keras_cv.bounding_box.transform_from_point_transform()`, which re-uses the keypoint augmentation for bounding boxes.
- It adds a `clip_points_to_image_size` option to the above layers to enable clipping / discarding of bounding boxes / keypoints that fall outside the image.
- It adds a proper demo on the lightweight TFDS AFLW2000-3D dataset with the three geometric augmentations currently supported.
I still have two questions:
- When discarding keypoints that fall outside of the image, the augmented keypoints are returned as a ragged tensor. However, this does not work nicely with `tf.vectorized_map` or `tf.map_fn`. For the moment I just note in the documentation that you should not use clipping on batched keypoint data. Detecting this condition at runtime to raise an unsupported-operation error is not easy (and the default error will be really cryptic). How should we handle this case?
- Augmenting bounding boxes and keypoints with `fill_mode` set to 'reflect' or 'tile' may create new landmarks or objects in the augmented image without any associated keypoints or bounding boxes. Should this setting be detected at runtime and a proper error be raised?

I would happily discuss these changes and improve them @LukeWood @tanzhenyu
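To make the raggedness issue in the first question concrete: discarding out-of-image keypoints generally leaves a different number of points per image, so a batch can no longer be a dense tensor. A plain-Python sketch of the discard step (illustrative only, not the PR's implementation):

```python
import numpy as np

def discard_out_of_image(keypoints_batch, image_shape):
    """Keep only the (x, y) keypoints that fall inside the image bounds.

    Each image in the batch may keep a different number of points,
    which is why the batched result is inherently ragged.
    """
    height, width = image_shape[:2]
    kept = []
    for points in keypoints_batch:
        points = np.asarray(points, dtype="float32")
        inside = (
            (points[:, 0] >= 0) & (points[:, 0] < width)
            & (points[:, 1] >= 0) & (points[:, 1] < height)
        )
        kept.append(points[inside].tolist())
    return kept

batch = [
    [[10.0, 20.0], [600.0, 30.0]],                                  # one point out of bounds
    [[-5.0, 5.0], [100.0, 100.0], [200.0, 300.0], [50.0, 700.0]],  # two points out of bounds
]
print(discard_out_of_image(batch, (512, 512, 3)))
# [[[10.0, 20.0]], [[100.0, 100.0], [200.0, 300.0]]] -- row lengths now differ
```

Because the per-image row lengths differ after discarding, mapping this over a dense batch with `tf.map_fn` or `tf.vectorized_map` cannot produce a dense output, which is the crux of the question above.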
Wow @atuleu - this is an incredible contribution! Thank you for providing this.
Just a heads up, we didn't have this on our roadmap so I haven't been prioritizing it in review. That being said... this is really great - and so I think we should allocate some resources to reviewing this. I will begin reviewing this ASAP - does that sound good @atuleu ? Thanks again for the effort you've put into this, this is awesome!
Hey @atuleu - this PR is great, but it is HUGE!
Is there any chance you can split this into several smaller PRs? My suggestion is as follows:
- create a PR adding support for keypoint formats & utilities (discard out of image, converters, etc)
- update the BaseImageAugmentationLayer in a PR
- add ALL of the no-op augmentations in a single PR; so all of the ones that just return keypoints
- for each augmentation layer that requires specific logic (rotation/shear), update that in a standalone PR, add demo with these components
- that should cover everything
Please let me know if this sounds like a good plan to you. I don't think I can review 2k lines of code without missing small errors or stylistic typos.
Sure, I understand it is not possible to review a large PR like this one. I will split it into several PRs (see the complete list in #518). For demonstration, this is what [atuleu/keras-cv@dev/keypoints] can do on the AFLW2000 dataset (68 face keypoints per image):
Thank you for this, this image is really promising.
Hi, I want to use this in PyTorch. Could you provide me with a simple example of similar data augmentation?
This looks very promising. +1 on Luke's suggestion of breaking this up into 4 steps. This is a huge contribution too -- if you want to create a design review under Keras governance, that'd be great as well.
Hello @atuleu, Any update to this PR?
Still waiting for some PRs to be accepted before continuing the work.
@atuleu Given this is aligned with our Q1 roadmap, I will be helping review all keypoint related PRs. Are you still working on it, and should I review this PR?
Thank you for your enthusiasm on this PR @atuleu ! This is awesome;
As some of these changes have already been merged, and the rest are stale due to shifts in the KPL structures, I'm closing this PR. If you want to continue working on this please feel free; but let's try to add keypoint support one layer at a time as we were before!
Thanks for your enthusiasm and hard work on this; it looks awesome from the visuals!