bevfusion
points projection may be incorrect in vtransform module
https://github.com/mit-han-lab/bevfusion/blob/c53e2283b0ebd00b5adbed8b3de0d39362ad3287/mmdet3d/models/vtransforms/base.py#L246
`points` does not contain only the sample-frame LiDAR points but also multi-sweep LiDAR points, so the projection to the image is incorrect and the depth of moving objects is wrong.
I remember that we align all the LiDAR point clouds to the same reference frame (the frame with the latest timestamp). As a result, there should be no problem with the projection. If you feel that an alternative implementation makes more sense to you, please feel free to let us know. Thank you.
All the LiDAR point clouds have indeed been aligned to the same reference frame via the sensor2lidar transform, but points captured on a moving object in a previous sweep will not line up with the object's current position, because the object has its own velocity. When those previous-sweep points are projected onto the sample-frame image, they produce depth errors.
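To make the failure mode concrete, here is a hypothetical numeric sketch (the intrinsics, velocity, and coordinates are made up for illustration, not taken from the codebase or nuScenes): a point on a moving object from one sweep earlier is compensated for ego motion only, so after alignment it sits where the object used to be, and its projected depth no longer matches the object's current surface.

```python
import numpy as np

# Assumed object motion along the camera's viewing axis (z).
v = np.array([0.0, 0.0, 10.0])        # object velocity (m/s), made up
dt = 0.5                               # time since the previous sweep (s)

p_now = np.array([2.0, 1.5, 30.0])    # point on the object at sample time
p_prev = p_now - v * dt               # where the stale sweep left that point

# Simple pinhole projection with assumed intrinsics (fx, fy, cx, cy).
K = np.array([[1000.0, 0.0, 800.0],
              [0.0, 1000.0, 450.0],
              [0.0, 0.0, 1.0]])

def project(p):
    uv = K @ p
    return uv[:2] / uv[2], p[2]       # pixel coordinates and depth

(uv_now, d_now), (uv_prev, d_prev) = project(p_now), project(p_prev)
# The stale point lands near the object in the image but carries the
# object's *old* depth (25 m instead of 30 m), corrupting the depth label
# for that pixel.
print(uv_now, d_now)
print(uv_prev, d_prev)
```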
I think the code should be like this:

```python
# only project sample-frame LiDAR points onto the sample image
cur_coords = points[b][points[b][:, -1] == 0][:, :3].transpose(1, 0)
```
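A self-contained sketch of that filtering, assuming (as in the usual nuScenes multi-sweep loading) that the last point-feature channel stores each sweep's time offset, with 0 marking the sample frame. The real code operates on torch tensors; NumPy is used here only so the example runs standalone, and the logic is identical:

```python
import numpy as np

def current_sweep_coords(points_b):
    """Keep only sample-frame points before projecting to the image.

    points_b: (N, C) array for one batch element; the last channel is
    assumed to hold the per-sweep time offset (0 for the current frame).
    """
    mask = points_b[:, -1] == 0
    return points_b[mask][:, :3].transpose(1, 0)  # shape (3, N_cur)

# Toy example: two current-frame points and one previous-sweep point.
pts = np.array([[1.0, 2.0, 3.0, 0.00],
                [4.0, 5.0, 6.0, 0.00],
                [7.0, 8.0, 9.0, -0.45]])
coords = current_sweep_coords(pts)
print(coords.shape)  # (3, 2) -- the previous-sweep point is dropped
```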
Thank you for the nice visualization. Here is my perspective when we are designing this projection. We actually don't associate points from previous sweeps with an object. Instead, we think there is a 3D point in the space, and if it can be back projected onto the image plane, we get the depth of a certain pixel on the image plane (which is exactly the depth of this 3D point). In this case, we believe that such a projection still provides valid information. Your suggestion is definitely a good idea. We will have a look at it in the future.
Thank you for your reply. My concern is that points from the previous nine sweeps may be projected onto the same pixels as the current-sweep points (~25k points), because the multi-sweep point cloud is dense (~250k points) and locally scattered. I will do some more experiments in the future.
That's a good point.
Temporarily closed due to inactivity. Will route people to this thread if there are similar discussions.