`apply_affine_to_boxes` does not handle flipping properly, due to the left-closed and right-open nature of the boxes

Open function2-llx opened this issue 1 year ago • 5 comments

Describe the bug

apply_affine_to_boxes does not handle flipping properly, because boxes are left-closed and right-open. When a box is flipped, its left-closed, right-open coordinates become left-open and right-closed, whereas the original left-closed, right-open convention should be maintained.

To clarify, I understand that there is a flip_boxes function available, but I use flipping here as the simplest example. The same issue exists for any more complicated affine matrix that includes a flip.

To Reproduce

import torch

from monai.apps.detection.transforms.box_ops import apply_affine_to_boxes
from monai.data import MetaTensor
import monai.transforms as mt

def main():
    for flip_axis in range(3):
        flip = mt.Flip(spatial_axis=flip_axis)
        x = torch.zeros(1, 1, 1, 1)
        flipped: MetaTensor = flip(x)
        box = torch.tensor([[0, 0, 0, 1, 1, 1]])
        box_flipped = apply_affine_to_boxes(box, flipped.affine.inverse())
        print(box_flipped)

if __name__ == '__main__':
    main()

Expected behavior

Produce the same results as flip_boxes.

Actual Results

tensor([[-1,  0,  0,  0,  1,  1]])
tensor([[ 0, -1,  0,  1,  0,  1]])
tensor([[ 0,  0, -1,  1,  1,  0]])

Suggestion

My suggestion for the fix is to convert the box coordinates to be closed on both sides before applying the affine matrix, and to convert them back to the desired format afterwards.
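A minimal sketch of this suggestion in plain Python, assuming integer voxel coordinates and boxes in XYZXYZ order. `apply_affine_closed` is a hypothetical helper written for illustration and is not part of MONAI; it maps every corner of the closed box so that it also covers affines that are not axis-aligned.

```python
from itertools import product


def apply_affine_closed(box, affine):
    """Apply an affine to a half-open box via a closed-interval round trip.

    Hypothetical helper, not MONAI's API.
    box: [x0, y0, z0, x1, y1, z1] with half-open extents [x0, x1), ...
    affine: (dims + 1) x (dims + 1) nested list in homogeneous form.
    """
    dims = len(box) // 2
    lo = box[:dims]
    hi = [b - 1 for b in box[dims:]]  # half-open -> closed on both sides

    # Map all corners of the closed box through the affine.
    mapped = [
        [
            sum(affine[r][c] * corner[c] for c in range(dims)) + affine[r][dims]
            for r in range(dims)
        ]
        for corner in product(*zip(lo, hi))
    ]

    new_lo = [min(m[d] for m in mapped) for d in range(dims)]
    # closed -> half-open again
    new_hi = [max(m[d] for m in mapped) + 1 for d in range(dims)]
    return new_lo + new_hi


# Flip along x for an image of size 1 along x: index i maps to -i.
flip_x = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(apply_affine_closed([0, 0, 0, 1, 1, 1], flip_x))  # [0, 0, 0, 1, 1, 1]
```

With this round trip the box from the reproduction stays at [0, 0, 0, 1, 1, 1]. For non-integer coordinates the offset of 1 would need to be reconsidered.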

Additional Context

If I understand correctly, the left-closed, right-open convention is suggested here: https://github.com/Project-MONAI/MONAI/blob/e1a69b03c86ce065db2816b696ea4a6b57d46435/monai/data/box_utils.py#L40-L45

Feature Request

This is not the main topic of this issue, but I would also like to mention that the code above computes the inverse of the affine matrix explicitly. So far this is the only way I have found to apply an arbitrary affine transform to boxes. It would be nice if apply_affine_to_boxes could do this for me internally, solving a linear system instead of forming the inverse, for numerical stability.
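A minimal sketch of what solving a linear system instead of inverting could look like, using NumPy as a stand-in for illustration. `apply_inverse_affine` is a hypothetical helper, not an existing MONAI function; `torch.linalg.solve` would be the corresponding PyTorch call.

```python
import numpy as np


def apply_inverse_affine(points, affine):
    """Map points through the inverse of `affine` without forming the inverse.

    Hypothetical helper, not MONAI's API.
    points: (N, D) array-like of coordinates.
    affine: (D + 1, D + 1) homogeneous affine matrix.
    """
    points = np.asarray(points, dtype=float)
    affine = np.asarray(affine, dtype=float)

    # Append a column of ones to get homogeneous coordinates, shape (N, D + 1).
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.concatenate([points, ones], axis=1)

    # Solve affine @ x = p for every point p at once (one column per point).
    solved = np.linalg.solve(affine, homogeneous.T)

    # Drop the homogeneous row and return shape (N, D).
    return solved[:-1].T


# Example: an affine that scales x by 2 and shifts it by 1.
affine = [[2, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(apply_inverse_affine([[3, 4, 5]], affine))
```

Here the point (3, 4, 5) maps back to (1, 4, 5), since 2 * 1 + 1 = 3.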

function2-llx, May 04 '24 17:05

Hi @function2-llx, I'm not exactly sure what outcome you're expecting. It appears the result is correct as the six digits represent coordinates in the XYZXYZ format.

Thanks.

KumoLiu, May 06 '24 07:05

@KumoLiu Hello, the format is indeed correct. However, the coordinate values do not match those returned by flip_boxes. Here is the example again, this time including that function:

import torch

from monai.apps.detection.transforms.box_ops import apply_affine_to_boxes, flip_boxes
from monai.data import MetaTensor
import monai.transforms as mt

def main():
    for flip_axis in range(3):
        flip = mt.Flip(spatial_axis=flip_axis)
        x = torch.zeros(1, 1, 1, 1)
        flipped: MetaTensor = flip(x)
        box = torch.tensor([[0, 0, 0, 1, 1, 1]])
        box_flipped = apply_affine_to_boxes(box, flipped.affine.inverse())
        print('wrong:', box_flipped)
        box_flipped_correct = flip_boxes(box, x.shape[1:], flip_axis)
        print('correct:', box_flipped_correct)

if __name__ == '__main__':
    main()

The results are:

wrong: tensor([[-1,  0,  0,  0,  1,  1]])
correct: tensor([[0, 0, 0, 1, 1, 1]])
wrong: tensor([[ 0, -1,  0,  1,  0,  1]])
correct: tensor([[0, 0, 0, 1, 1, 1]])
wrong: tensor([[ 0,  0, -1,  1,  1,  0]])
correct: tensor([[0, 0, 0, 1, 1, 1]])

function2-llx, May 06 '24 07:05

Hi @function2-llx, the difference you're seeing arises from the second argument you're passing into the flip_boxes function, which is the spatial_size of the image. The spatial_size parameter dictates the axis along which you intend to flip. For more details, please take a look at the code here. If you set it to [0, 0, 0], you will get the same result as apply_affine_to_boxes.
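For reference, the arithmetic behind this can be sketched in plain Python, assuming a flip maps a half-open interval [a, b) on an axis of length s to [s - b, s - a). `flip_half_open` is a hypothetical stand-in for illustration, not MONAI's `flip_boxes` itself.

```python
def flip_half_open(box, spatial_size, axis):
    """Flip a half-open XYZXYZ box along one axis of an image.

    Hypothetical stand-in for illustration, not MONAI's API.
    The interval [a, b) on an axis of length s becomes [s - b, s - a).
    """
    dims = len(box) // 2
    flipped = list(box)
    flipped[axis] = spatial_size[axis] - box[axis + dims]  # new start
    flipped[axis + dims] = spatial_size[axis] - box[axis]  # new stop
    return flipped


box = [0, 0, 0, 1, 1, 1]
print(flip_half_open(box, [1, 1, 1], axis=0))  # [0, 0, 0, 1, 1, 1]
print(flip_half_open(box, [0, 0, 0], axis=0))  # [-1, 0, 0, 0, 1, 1]
```

Under this assumption, a spatial size of 1 leaves the box unchanged, while a spatial size of 0 reproduces the first value printed by apply_affine_to_boxes above.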

This comment in the discussion about the issue might also provide some helpful context.

Thanks.

KumoLiu, May 06 '24 08:05

> The spatial_size parameter dictates the axis along which you intend to flip.

I'm sorry if I misunderstand anything, but isn't the spatial_size parameter literally indicating the size of the image that the boxes are on?

https://github.com/Project-MONAI/MONAI/blob/e1a69b03c86ce065db2816b696ea4a6b57d46435/monai/apps/detection/transforms/box_ops.py#L161-L169

function2-llx, May 06 '24 08:05

Hi @function2-llx,

> but isn't the spatial_size parameter literally indicating the size of the image that the boxes are on?

Yes, for flip_boxes and for detection. However, boxes can also be just boxes: geometric data that exists without a reference image. So flip_boxes here is intended only for the detection application.

KumoLiu, May 06 '24 09:05