albumentations
Bbox results after transformations are off
🐛 Bug
To Reproduce
I load the same image with my training and validation augmentation stacks.
My val augmentation stack:
self.val_transforms = A.Compose(
    [
        # LETTERBOX WITH ALBUMENTATIONS OPERATIONS
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.LongestMaxSize
        A.geometric.resize.LongestMaxSize(self.imgsz),
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PadIfNeeded
        A.geometric.transforms.PadIfNeeded(self.imgsz, self.imgsz, border_mode=0, value=(114, 114, 114)),
        # https://albumentations.ai/docs/api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensorV2
        # The numpy HWC image is converted to a pytorch CHW tensor.
        ToTensorV2()
    ],
    bbox_params=A.BboxParams(format='coco', label_fields=['category_ids']),  # COCO: source format
)
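For reference, the letterbox step (LongestMaxSize followed by PadIfNeeded) only scales and shifts a box, so it cannot loosen it on its own. The sketch below shows the expected mapping for a COCO box. It is a hand-written approximation, not the library's code: the helper name letterbox_coco_bbox is hypothetical, and it assumes centered padding and rounding of the resized dimensions to the nearest integer.

```python
def letterbox_coco_bbox(bbox, img_w, img_h, imgsz):
    """Map a COCO bbox (x_min, y_min, width, height) through a letterbox.

    Assumptions (not taken from the library source): the longest side is
    scaled to `imgsz`, and the padding is split evenly with the extra
    pixel, if any, going to the bottom/right.
    """
    scale = imgsz / max(img_w, img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_left = (imgsz - new_w) // 2
    pad_top = (imgsz - new_h) // 2
    x, y, w, h = bbox
    # Scaling keeps the box tight; padding only translates it.
    return (x * scale + pad_left, y * scale + pad_top, w * scale, h * scale)


# A 1280x720 image letterboxed to 640x640: scale 0.5, 140 px padding on top.
print(letterbox_coco_bbox((100, 50, 200, 100), img_w=1280, img_h=720, imgsz=640))
# (50.0, 165.0, 100.0, 50.0)
```

Because the box stays tight through this step, any loosening has to come from the geometric transforms that run before it in the training stack.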
My train augmentation stack:
self.train_transforms = A.Compose(
    [
        # COLOR TRANSFORMATIONS --------------
        A.augmentations.transforms.ColorJitter(p=0.1),
        A.augmentations.transforms.Sharpen(p=0.1),
        A.augmentations.transforms.ToGray(p=0.1),
        # GEOMETRICAL TRANSFORMATIONS --------------
        A.augmentations.geometric.transforms.HorizontalFlip(p=0.5),  # flip image on its vertical axis
        A.augmentations.geometric.transforms.Affine(
            translate_percent=0.1,
            rotate=4,
            shear=2,
            scale=(0.9, 1.1),
            mode=0,
            cval=(114, 114, 114),
            p=0.9
        ),
        A.augmentations.transforms.ImageCompression(quality_lower=50, p=0.1),
        # LETTERBOX WITH ALBUMENTATIONS OPERATIONS
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.LongestMaxSize
        A.geometric.resize.LongestMaxSize(self.imgsz),
        # https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PadIfNeeded
        A.geometric.transforms.PadIfNeeded(self.imgsz, self.imgsz, border_mode=0, value=(114, 114, 114)),
        # https://albumentations.ai/docs/api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensorV2
        # The numpy HWC image is converted to a pytorch CHW tensor when using this augmentation
        ToTensorV2()
    ],
    bbox_params=A.BboxParams(format='coco', label_fields=['category_ids']),  # COCO: source format
)
Image result after going through the validation dataloader (only letterboxed):
Image result after going through the training dataloader (color + geometrical + letterbox operations):
Expected behavior
This is clearly wrong. The bboxes should be much tighter. Affine seems to be causing this. Is this a bug, or am I missing something? This issue is similar to: https://github.com/albumentations-team/albumentations/issues/1373
Environment
- Albumentations version (e.g., 0.1.8): 1.3.0
- Python version (e.g., 3.8): 3.8
- OS (e.g., Linux): Linux
- How you installed albumentations (conda, pip, source): pip
- Any other relevant information:
Additional context
Is this on your roadmap? @onurtore, @Dipet
Nope, not mine
I think this is not a bug. As pointed out in #746, to fit the rotated bounding box on the targets, information about the shape of the target is needed. Setting rotate_method="ellipse" might mitigate your issue, but if you have rectangle targets, it might make too small a bounding box because the corners will be cut off. See also #1203 or https://openaccess.thecvf.com/content/ICCV2021/papers/Kalra_Towards_Rotation_Invariance_in_Object_Detection_ICCV_2021_paper.pdf
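To make the trade-off concrete, here is a minimal sketch of the two strategies for a pure rotation about the box center. It uses closed-form geometry rather than the library's implementation, and the function name rotated_bbox_size is hypothetical. The "largest_box" branch encloses the four rotated corners, while the "ellipse" branch encloses the rotated ellipse inscribed in the original box.

```python
import math


def rotated_bbox_size(w, h, angle_deg, method="largest_box"):
    """Size of the axis-aligned box after rotating a w x h box about its center.

    Closed-form illustration only; this is not albumentations code.
    """
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    if method == "largest_box":
        # Enclose the four rotated corners of the original box.
        return (w * c + h * s, w * s + h * c)
    if method == "ellipse":
        # Enclose the rotated ellipse inscribed in the original box.
        return (math.hypot(w * c, h * s), math.hypot(w * s, h * c))
    raise ValueError(f"unknown method: {method}")


for angle in (4, 45):
    print(angle,
          rotated_bbox_size(100, 100, angle, "largest_box"),
          rotated_bbox_size(100, 100, angle, "ellipse"))
```

For a square box the ellipse variant keeps the original size at every angle, which is tight for round targets but cuts the corners of rectangular ones, while the corner variant grows by a factor of up to sqrt(2) at 45 degrees.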
Very helpful comment, really appreciate it @i-aki-y. So your suggestion is not to rotate with Affine, which doesn't have the rotate_method option, and to use Rotate (which has this option) instead? Can shear also lead to this type of behavior, or is it only rotation?
So your suggestion is not to rotate with Affine, which doesn't have the rotate_method option, and to use Rotate (which has this option) instead?
Yes
Can shear also lead to this type of behavior, or is it only rotation?
See below:
I see, thanks again. So, Affine is composed of:
- Translation ("move" the image along the x-/y-axis)
- Rotation
- Scaling ("zoom" in/out)
- Shear (shift one side of the image, turning a square into a parallelogram)
Using ShiftScaleRotate would cover three of these operations, and it has the rotate_method option. I couldn't find any shear operation with rotate_method, though. Do you have a better suggestion @i-aki-y?
On a different note: after this conversation, it seems to me that rotate_method='ellipse' is superior to 'largest_box' for most applications. Maybe it should be the default in albumentations for handling bbox transformations?
Another similar issue: https://github.com/albumentations-team/albumentations/issues/182
@mikel-brostrom I think there is no easy workaround for using a shear transform with ellipse rotation. As I showed above, a shear operation introduces extra space between the target and the bbox, so if you include shear in a random affine transform, some extra space will appear. I think we need to generalize the ellipse rotation to account for the shear effect.
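One way such a generalization could look, as a rough sketch only: sample points on the ellipse inscribed in the box, push them through the full 2x3 affine matrix (shear included), and take the min/max of the result. The function bbox_after_affine and its parameters are hypothetical and are not claimed to match the PR; the 20 degree shear is exaggerated compared with the shear=2 used in the report so that the difference is easy to see.

```python
import math


def bbox_after_affine(bbox, matrix, method="ellipse", n_points=360):
    """Axis-aligned bbox (x_min, y_min, x_max, y_max) after a 2x3 affine map.

    Illustrative sketch, not library code. `matrix` is ((a, b, tx), (c, d, ty)).
    """
    x_min, y_min, x_max, y_max = bbox
    if method == "largest_box":
        # Transform only the four corners of the original box.
        pts = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    else:
        # Transform points sampled on the ellipse inscribed in the box.
        cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
        rx, ry = (x_max - x_min) / 2, (y_max - y_min) / 2
        pts = [
            (cx + rx * math.cos(2 * math.pi * k / n_points),
             cy + ry * math.sin(2 * math.pi * k / n_points))
            for k in range(n_points)
        ]
    (a, b, tx), (c, d, ty) = matrix
    xs = [a * x + b * y + tx for x, y in pts]
    ys = [c * x + d * y + ty for x, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Horizontal shear by 20 degrees: x' = x + tan(20 deg) * y, y' = y.
shear = ((1.0, math.tan(math.radians(20)), 0.0), (0.0, 1.0, 0.0))
box = (0, 0, 100, 100)
corner_box = bbox_after_affine(box, shear, method="largest_box")
ellipse_box = bbox_after_affine(box, shear, method="ellipse")
print(corner_box[2] - corner_box[0], ellipse_box[2] - ellipse_box[0])
```

With this matrix the corner-based box is about 136 px wide while the ellipse-based box is about 106 px wide, which is the extra space described above. The same caveat as for rotation applies: for truly rectangular targets the ellipse-based box can be too small.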
@mikel-brostrom I made a PR.
This PR might fix this issue #1394. Thank you so much @i-aki-y
Looks like everything works.