bevfusion
points projection may be incorrect in vtransform module
https://github.com/mit-han-lab/bevfusion/blob/c53e2283b0ebd00b5adbed8b3de0d39362ad3287/mmdet3d/models/vtransforms/base.py#L246
`points` does not contain only the sample-frame LiDAR points but also multi-sweep LiDAR points, so the projection to the image is incorrect and the depth of moving objects is wrong.
I remember that we align all the LiDAR point clouds to the same reference frame (the frame with the latest timestamp). As a result, there should be no problem with the projection. If you feel that an alternative implementation makes more sense to you, please feel free to let us know. Thank you.
All the LiDAR point clouds have indeed been aligned to the same reference frame via the sensor2lidar transform, but points captured on a moving object in a previous sweep will not line up with the object's current position, because the object has its own velocity. When those previous-sweep points are projected onto the sample-frame image, they produce depth errors.
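To make the failure mode concrete, here is a hypothetical numeric sketch (the intrinsics, velocity, and coordinates are made up for illustration, not taken from the codebase or nuScenes): a point on a moving object from one sweep earlier is compensated for ego motion only, so after alignment it sits where the object used to be, and its projected depth no longer matches the object's current surface.

```python
import numpy as np

# Assumed object motion along the camera's viewing axis (z).
v = np.array([0.0, 0.0, 10.0])        # object velocity (m/s), made up
dt = 0.5                               # time since the previous sweep (s)

p_now = np.array([2.0, 1.5, 30.0])    # point on the object at sample time
p_prev = p_now - v * dt               # where the stale sweep left that point

# Simple pinhole projection with assumed intrinsics (fx, fy, cx, cy).
K = np.array([[1000.0, 0.0, 800.0],
              [0.0, 1000.0, 450.0],
              [0.0, 0.0, 1.0]])

def project(p):
    uv = K @ p
    return uv[:2] / uv[2], p[2]       # pixel coordinates and depth

(uv_now, d_now), (uv_prev, d_prev) = project(p_now), project(p_prev)
# The stale point lands near the object in the image but carries the
# object's *old* depth (25 m instead of 30 m), corrupting the depth label
# for that pixel.
print(uv_now, d_now)
print(uv_prev, d_prev)
```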
I think the code should be like this:

```python
# only project sample-frame LiDAR points onto the sample image
cur_coords = points[b][points[b][:, -1] == 0][:, :3].transpose(1, 0)
```
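A self-contained sketch of that filtering, assuming (as in the usual nuScenes multi-sweep loading) that the last point-feature channel stores each sweep's time offset, with 0 marking the sample frame. The real code operates on torch tensors; NumPy is used here only so the example runs standalone, and the logic is identical:

```python
import numpy as np

def current_sweep_coords(points_b):
    """Keep only sample-frame points before projecting to the image.

    points_b: (N, C) array for one batch element; the last channel is
    assumed to hold the per-sweep time offset (0 for the current frame).
    """
    mask = points_b[:, -1] == 0
    return points_b[mask][:, :3].transpose(1, 0)  # shape (3, N_cur)

# Toy example: two current-frame points and one previous-sweep point.
pts = np.array([[1.0, 2.0, 3.0, 0.00],
                [4.0, 5.0, 6.0, 0.00],
                [7.0, 8.0, 9.0, -0.45]])
coords = current_sweep_coords(pts)
print(coords.shape)  # (3, 2) -- the previous-sweep point is dropped
```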
Thank you for the nice visualization. Here is my perspective when we are designing this projection. We actually don't associate points from previous sweeps with an object. Instead, we think there is a 3D point in the space, and if it can be back projected onto the image plane, we get the depth of a certain pixel on the image plane (which is exactly the depth of this 3D point). In this case, we believe that such a projection still provides valid information. Your suggestion is definitely a good idea. We will have a look at it in the future.
Thank you for your reply. My concern is that points from the previous nine sweeps may be projected onto the same pixels as the current-sweep points (~25k points), because the multi-sweep point cloud is dense (~250k points) and locally scattered. I will do some more experiments in the future.
That's a good point.
Temporarily closed due to inactivity. Will route people to this thread if there are similar discussions.