
box3d_overlap gives unexpected results even with correct order of vertices

psteeves opened this issue Aug 09 '22 · 7 comments

🐛 Bugs / Unexpected behaviors

box3d_overlap gives unexpected results even with correct order of vertices.

Instructions To Reproduce the Issue:

I have the following code, which takes two sets of cuboid points and attempts to use the IoU function. The cuboids are nearly identical, so I expect the IoU to be very close to 1. The vertices are not originally in the expected order, but I do reorder them:

import numpy as np
from torch import FloatTensor
from pytorch3d.ops import box3d_overlap


POINTS_ORDER = [3, 1, 0, 2, 7, 5, 4, 6]
def cuboid_to_tensor(points):
    reordered_points = np.array(points)[POINTS_ORDER]
    tensor = FloatTensor(reordered_points)
    return tensor[None]

cuboid_points_1 = [[-34.67156, -2.65437, -0.95367], [-34.67156, -2.65437, 0.96941], [-33.94158, -4.51639, -0.95367], [-33.94158, -4.51639, 0.96941], [-39.48952, -4.54319, -0.95367], [-39.48952, -4.54319, 0.96941], [-38.75954, -6.40521, -0.95367], [-38.75954, -6.40521, 0.96941]]
cuboid_points_2 = [[-34.67158, -2.65437, -0.95368], [-34.67158, -2.65437, 0.96939], [-33.94159, -4.51638, -0.95368], [-33.94159, -4.51638, 0.96939], [-39.48953, -4.54321, -0.95368], [-39.48953, -4.54321, 0.96939], [-38.75954, -6.40523, -0.95368], [-38.75954, -6.40523, 0.96939]]

tensor_1 = cuboid_to_tensor(cuboid_points_1)
tensor_2 = cuboid_to_tensor(cuboid_points_2)

print(box3d_overlap(tensor_1, tensor_2))  # Returns (25.7176, 1.8253)
print(box3d_overlap(tensor_2, tensor_1))  # Returns (35.8715, 9.1139)

As shown in the comments on the last two lines, the returned IoU is greater than 1. Not only that, if I swap the tensor arguments the return value changes. Any idea what is going on here? Thanks in advance 🙂
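For reference, the POINTS_ORDER reindexing above is my attempt to match the corner convention from the box3d_overlap docstring, which (if I'm reading it correctly) lists the eight corners of a unit box as the four corners of one face followed by the corresponding four corners of the opposite face:

box_corner_vertices = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
]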

psteeves avatar Aug 09 '22 19:08 psteeves

Ugh! Unfortunately, the issue is numerical. Let me explain!

There is an EPS constant in the code that is used in various checks between box coordinates.

https://github.com/facebookresearch/pytorch3d/blob/276c9a8acbfa715f5802f26ec9f4141bde26ecb4/pytorch3d/csrc/iou_box3d/iou_utils.cuh#L15

Now your boxes are really close (the coordinate differences are approaching the EPS value), which makes the code behave strangely (and that is our fault!). For example, if you slightly perturb one box's coordinates:

print(box3d_overlap(tensor_1, tensor_2 + 0.0001))
> (tensor([[19.9018]]), tensor([[0.9998]]))
print(box3d_overlap(tensor_2 + 0.0001, tensor_1))
> (tensor([[19.9018]]), tensor([[0.9998]]))

Note that I perturbed by an order of magnitude more than EPS. I will work on finding a good solution so that the code does not behave like this for boxes that are within EPS of each other. When the boxes are not that close, the algorithm should be fine!
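You can see how close your boxes are by checking the per-coordinate differences directly; for your two cuboids the largest delta is on the order of 1e-5, i.e. right around the EPS scale:

print((tensor_1 - tensor_2).abs().max())  # roughly 2e-5 for your cuboids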

gkioxari avatar Aug 10 '22 10:08 gkioxari

Thank you @gkioxari! When you say that the boxes are really close (approaching the EPS value), are you referring to the distance between them? Thanks for looking into a solution; in the meantime I may just always perturb the boxes, as I'm not concerned about tiny differences in IoU.

psteeves avatar Aug 10 '22 11:08 psteeves

Yes. The distance between the box coordinates is close to EPS. So in the code, when we do some if-else checks (I don't want to go into implementation details!), the EPS can flip the outcome of those checks because of that.

For now, add a small perturbation to the box2 coordinates (as long as that perturbation is small compared to the absolute value of the coordinates). I will work on a more robust solution because this bugs me beyond belief!
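A minimal sketch of that workaround (iou_with_jitter is just a throwaway helper; the 1e-4 offset is an example value, small relative to your coordinates but larger than EPS):

def iou_with_jitter(boxes1, boxes2, jitter=1e-4):
    # Nudge the second box slightly so near-identical boxes don't hit
    # the EPS-related instability, then compute the IoU as usual.
    _, iou = box3d_overlap(boxes1, boxes2 + jitter)
    return iou

print(iou_with_jitter(tensor_1, tensor_2))  # should be very close to 1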

gkioxari avatar Aug 10 '22 11:08 gkioxari

For full disclosure, this doesn't generally happen for EPS-close boxes; the numerical instability is particularly triggered by your example. For instance, the boxes below are also *very close*, but the function outputs the right thing:

print(box3d_overlap(tensor_1[None], tensor_1[None] + torch.rand((1,)) * 0.00001))
> (tensor([[19.9038]]), tensor([[1.]]))

It's a bug for your case nevertheless. Investigating it!

gkioxari avatar Aug 10 '22 13:08 gkioxari

Makes sense. It does seem to happen quite often with the data I'm working with, but if you haven't seen this before it's probably something specific to this distribution. Lmk if I can help with your investigation 🙂

psteeves avatar Aug 10 '22 14:08 psteeves

Ah interesting! Could you perhaps provide a few more (3-4) test cases, so that I can add them to our test function to make sure the solution works for more of your cases and that future changes to the code don't break them?

gkioxari avatar Aug 10 '22 18:08 gkioxari

Sure thing, here are some other test cases:

# iou = 2.286954164505005
tensor([[[-105.6248,  -32.7026,   -1.2279],
         [-106.4690,  -30.8895,   -1.2279],
         [-106.4690,  -30.8895,   -3.0279],
         [-105.6248,  -32.7026,   -3.0279],
         [-110.1575,  -34.8132,   -1.2279],
         [-111.0017,  -33.0001,   -1.2279],
         [-111.0017,  -33.0001,   -3.0279],
         [-110.1575,  -34.8132,   -3.0279]]])
tensor([[[-105.5094,  -32.9504,   -1.0641],
         [-106.4272,  -30.9793,   -1.0641],
         [-106.4272,  -30.9793,   -3.1916],
         [-105.5094,  -32.9504,   -3.1916],
         [-110.0421,  -35.0609,   -1.0641],
         [-110.9599,  -33.0899,   -1.0641],
         [-110.9599,  -33.0899,   -3.1916],
         [-110.0421,  -35.0609,   -3.1916]]])

# iou = 1.2187999486923218
tensor([[[-59.4785, -15.6003,   0.4398],
         [-60.2263, -13.6928,   0.4398],
         [-60.2263, -13.6928,  -1.3909],
         [-59.4785, -15.6003,  -1.3909],
         [-64.1743, -17.4412,   0.4398],
         [-64.9221, -15.5337,   0.4398],
         [-64.9221, -15.5337,  -1.3909],
         [-64.1743, -17.4412,  -1.3909]]])
tensor([[[-59.4874, -15.5775,  -0.1512],
         [-60.2174, -13.7155,  -0.1512],
         [-60.2174, -13.7155,  -1.9820],
         [-59.4874, -15.5775,  -1.9820],
         [-64.1832, -17.4185,  -0.1512],
         [-64.9132, -15.5564,  -0.1512],
         [-64.9132, -15.5564,  -1.9820],
         [-64.1832, -17.4185,  -1.9820]]])

# iou = 1.6558291912078857
tensor([[[-167.5847,  -70.6167,   -2.7927],
         [-166.7333,  -72.4264,   -2.7927],
         [-166.7333,  -72.4264,   -4.5927],
         [-167.5847,  -70.6167,   -4.5927],
         [-163.0605,  -68.4880,   -2.7927],
         [-162.2090,  -70.2977,   -2.7927],
         [-162.2090,  -70.2977,   -4.5927],
         [-163.0605,  -68.4880,   -4.5927]]])
tensor([[[-167.5847,  -70.6167,   -2.7927],
         [-166.7333,  -72.4264,   -2.7927],
         [-166.7333,  -72.4264,   -4.5927],
         [-167.5847,  -70.6167,   -4.5927],
         [-163.0605,  -68.4880,   -2.7927],
         [-162.2090,  -70.2977,   -2.7927],
         [-162.2090,  -70.2977,   -4.5927],
         [-163.0605,  -68.4880,   -4.5927]]])
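For reference, the IoU values in the comments above come from calling box3d_overlap directly on these (1, 8, 3) tensors, roughly like this (iou_of is just a throwaway helper):

import torch
from pytorch3d.ops import box3d_overlap

def iou_of(boxes_1, boxes_2):
    # boxes_*: float tensors of shape (1, 8, 3), with the corners already
    # in the vertex order that box3d_overlap expects.
    _, iou = box3d_overlap(boxes_1, boxes_2)
    return iou.item()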

psteeves avatar Aug 10 '22 20:08 psteeves

@psteeves The issue is now fixed by https://github.com/facebookresearch/pytorch3d/commit/1bfe6bf20a1de877cc623d11c2eeed8c7091ae90

I added your test cases to our tests. Some of them I couldn't quite reproduce even with the original code; for example, the first example in your latest message was returning something around 0.5. In any case, it doesn't matter: the new code should do the right thing for very close boxes.

I am closing the issue! If you encounter new problems please post a new issue!

gkioxari avatar Aug 22 '22 13:08 gkioxari