
KITTI dataset doubts

Open pk1996 opened this issue 3 years ago • 7 comments

Hi, thank you for the amazing code base.

I wanted to clarify a few things. I am trying to debug my changes for training on a different custom dataset, and am following your KITTI dataset implementation as a reference.

So in the KITTI dataset, I see that we get the 3D location and rotation in the camera coordinate frame (https://github.com/bostondiditeam/kitti/blob/master/resources/devkit_object/readme.txt). We then translate it to the lidar coordinate frame before feeding it into the model via the getitem method.

Doubt1 - In the case of datasets with only lidar, we need not do this step, right? Since the data is already in that reference frame. Also, a small redundancy/bug? I see that you do this computation in the get_infos method and store it as gt_boxes_lidar, but do not use it in the getitem method, and end up doing the computation again.

Doubt2 - I am not sure I understand the angle transformation that is applied. Is it to translate to the normative coordinates you follow in the code base, or is it something else? In my case, where the angle is already in the lidar coordinate frame, do I need to make the same change? (https://github.com/open-mmlab/OpenPCDet/issues/128#issuecomment-654554276)

Thanks

pk1996 avatar Jul 26 '22 18:07 pk1996

Hi, first you can refer to my successful example for a custom dataset (KITTI format); it shows how to make sure the coordinates are correct: https://github.com/OrangeSodahub/CRLFnet#lid-cam-fusion, https://github.com/OrangeSodahub/CRLFnet/blob/master/src/site_model/src/LidCamFusion/OpenPCDet/pcdet/datasets/custom/README.md. This is also in PR #1032; you are welcome to review it.

For your doubt1: with only lidar points, the transformation from camera is not needed, but make sure that your lidar points and labels (including the x, y, z of objects) are in the KITTI coordinates before feeding them into the model. In my case, the raw lidar data are in my custom dataset's coordinates (according to the labeling tool) while the label files follow the KITTI format, so a transformation to KITTI coordinates for the labels is needed in custom_dataset.py. It is used in the get_infos function; this operation generates xxxdataset.pkl, and that pkl file is read by the getitem function during training in train.py, so the transformation does take effect in training. You can see that the paths of the pkl files are stored in the relevant .yaml config files and are read when building the dataloader.
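For context, the rect-camera-to-lidar mapping that OpenPCDet's calib.rect_to_lidar performs can be sketched in plain NumPy. This is a simplified sketch, not the library's exact implementation; R0 and V2C here stand in for the rectification and velodyne-to-camera matrices parsed from a KITTI calib file:

```python
import numpy as np

def rect_to_lidar(pts_rect, R0, V2C):
    """Map (N, 3) points from the rectified camera frame to the lidar frame.

    Inverts x_rect = R0 @ (V2C @ x_lidar_homogeneous), where R0 is the 3x3
    rectification matrix and V2C the 3x4 velodyne-to-camera extrinsics.
    """
    pts_ref = np.linalg.inv(R0) @ pts_rect.T             # rect -> reference cam
    V2C_hom = np.vstack([V2C, [0.0, 0.0, 0.0, 1.0]])     # promote to 4x4
    C2V = np.linalg.inv(V2C_hom)                         # camera -> velodyne
    pts_hom = np.vstack([pts_ref, np.ones((1, pts_rect.shape[0]))])
    return (C2V @ pts_hom).T[:, :3]
```

With a real KITTI calibration file, R0 would correspond to R0_rect and V2C to Tr_velo_to_cam; with identity calibration the mapping is the identity, and with a pure translation t in V2C the lidar points are shifted by -t.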

For doubt2, if your angles are in the same coordinates as KITTI's, then I think no more changes are needed. But the premise is that you understand the KITTI coordinates.

OrangeSodahub avatar Jul 27 '22 01:07 OrangeSodahub

Hi

Thanks for the reply!

I will go through your codebase. My changes are here - https://github.com/open-mmlab/OpenPCDet/compare/master...pk1996:OpenPCDet:master

  1. I did ensure that the coordinate system of my data matches that of KITTI LiDAR i.e. that the X,Y,Z align.
  2. I get the flow. My point is that annotations['gt_boxes_lidar'] already has that computed, so it is not a bug.
  3. Could you elaborate on your point? I am unsure, because in kitti_dataset.py the authors compute gt_boxes_lidar as gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1), and similarly in box_utils, boxes3d_kitti_camera_to_lidar() uses np.concatenate([xyz_lidar, l, w, h, -(r + np.pi / 2)], axis=-1).
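The two snippets quoted in point 3 apply the same heading transform, just written with the terms swapped; a quick NumPy check with toy values (the zeros and ones below are placeholders, not real boxes):

```python
import numpy as np

rots = np.array([0.0, 0.5, -1.2])        # toy rotation_y values
loc_lidar = np.zeros((3, 3))             # toy box centers in the lidar frame
l = w = h = np.ones((3, 1))              # toy box dimensions

# kitti_dataset.py style
boxes_a = np.concatenate(
    [loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1)
# box_utils.boxes3d_kitti_camera_to_lidar style
boxes_b = np.concatenate(
    [loc_lidar, l, w, h, -(rots[..., np.newaxis] + np.pi / 2)], axis=-1)

assert np.allclose(boxes_a, boxes_b)     # identical (N, 7) boxes
```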

pk1996 avatar Jul 27 '22 01:07 pk1996

Well, if my memory serves me right, the KITTI coordinates are in the camera coordinate frame, and this camera is the top camera (located on top of the car); its axes are shown on the left below:

                 x                     y
                 |                     |
                 |                     |
                 |                     |
z----------------y                     z------------------x

 kitti (top camera)                     OpenPCDet

The right one is the uniform coordinate frame of OpenPCDet. My own custom dataset is already in the OpenPCDet coordinates, while my label files are in the KITTI coordinates (as shown in my code).

Besides, the clockwise angle is positive in the OpenPCDet coordinates, while the anticlockwise angle is positive in KITTI's (measured from the x-axis in both). That's why l, w, h, (-np.pi / 2 + rotation_y) is used.

Maybe the OpenPCDet coordinates as I remember them are wrong (it's been a long time); you could verify it.

OrangeSodahub avatar Jul 27 '22 08:07 OrangeSodahub

This is confusing!

From the fact that the authors use rect_to_lidar to translate the 3D location into the Velodyne frame of reference here, we can tell that the KITTI velodyne/lidar frame of reference is the same as the OpenPCDet frame of reference.

Now, referring to the coordinate system illustrated in the bottom image here, we can conclude that the coordinate system of OpenPCDet is as shown at the top below. The camera system interpreted from the same diagram is the one shown at the bottom.

              x
              |   
              |  
              |  
              |  
              |  
     y--------z

KITTI Velodyne coordinate system
OpenPCDet coordinate frame.

              z
              |   
              |  
              |  
              |  
              |  
              y--------x

KITTI Camera coordinate system

Note that here z points out of the screen, and the coordinate system is observed in bird's eye view / top view in the case of the OpenPCDet / KITTI velodyne system. For the KITTI camera coordinate system, on the other hand, y points into the screen. This matches the blog too.

Coming to the angle: it is -(np.pi / 2 + rotation_y) and not (-np.pi / 2 + rotation_y). I am sorry, but the math related to the angle doesn't add up for me!
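For what it's worth, the -(np.pi / 2 + rotation_y) form does check out numerically under the axis mapping sketched above, i.e. assuming x_lidar = z_cam and y_lidar = -x_cam (which ignores the small calibration offset between the two sensors):

```python
import numpy as np

# Camera frame: x right, y down, z forward; rotation_y (ry) is about +y.
# Lidar frame:  x forward, y left, z up; heading (yaw) is about +z.
# Assumption: ignoring the calibration offset, x_lidar = z_cam, y_lidar = -x_cam.

def heading_cam(ry):
    # object forward direction in the camera x-z plane: R_y(ry) @ [1, 0, 0]
    return np.array([np.cos(ry), -np.sin(ry)])      # (x_cam, z_cam)

def heading_lidar(yaw):
    return np.array([np.cos(yaw), np.sin(yaw)])     # (x_lidar, y_lidar)

for ry in np.linspace(-np.pi, np.pi, 9):
    x_cam, z_cam = heading_cam(ry)
    mapped = np.array([z_cam, -x_cam])              # camera axes -> lidar axes
    assert np.allclose(heading_lidar(-(np.pi / 2 + ry)), mapped)
```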

pk1996 avatar Jul 27 '22 22:07 pk1996

@pk1996 Yes, you're right. I checked the relevant code for my transformation of the rotation_y angle: yes, the KITTI lidar frame is the same as the OpenPCDet frame, and KITTI labels are in the camera's frame. I did this a long time ago, sorry about that.

OrangeSodahub avatar Jul 28 '22 02:07 OrangeSodahub

Glad my interpretation of the coordinate system is right!

Hope the repo maintainers can shed some light on the angle-related doubt!

BTW @OrangeSodahub did you consider modifying the anchor boxes in the model config?

pk1996 avatar Jul 28 '22 02:07 pk1996

Yes, I modified the anchor box sizes in pv_rcnn.yaml. Configs like pointrcnn.yaml have no anchor box parameter.

OrangeSodahub avatar Jul 28 '22 06:07 OrangeSodahub

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Aug 28 '22 02:08 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Sep 12 '22 02:09 github-actions[bot]

(Comment for self) KITTI coordinate system explained; see pk1996's explanation above.


adv010 avatar Jul 04 '23 12:07 adv010