labelCloud

About KITTI format label

Open deemoo-wang opened this issue 3 years ago • 1 comments

I want to know how to transform between the Velodyne coordinate frame and the camera coordinate frame. In the source code, the transformation is "centroid = (-centroid[1], centroid[2] + 2.3, centroid[0])", which means:

camera x = -(velodyne y)
camera y = (velodyne z) + 2.3
camera z = velodyne x

I want to know why the transformation is defined like this. As far as I know, the coordinate frames in the KITTI dataset are defined as below:

Camera: x = right, y = down, z = forward
Velodyne: x = forward, y = left, z = up

and the calib data is used to transform between the two frames for each specific sample.
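For reference, the two axis conventions listed above imply a fixed axis permutation when the sensor offset and the per-sample calibration are ignored. The snippet below is only a sketch of that idealized mapping; the names `R_AXES_VELO_TO_CAM` and `velo_to_cam_axes` are made up for illustration and are not part of labelCloud or the KITTI devkit.

```python
import numpy as np

# Each row expresses one camera axis in Velodyne coordinates, using only the
# axis conventions quoted above (no translation, no calibration).
R_AXES_VELO_TO_CAM = np.array([
    [0.0, -1.0,  0.0],  # camera x (right)   = -velodyne y (left)
    [0.0,  0.0, -1.0],  # camera y (down)    = -velodyne z (up)
    [1.0,  0.0,  0.0],  # camera z (forward) =  velodyne x (forward)
])


def velo_to_cam_axes(point_velo):
    """Re-express a Velodyne-frame point in camera axes (idealized, no offset)."""
    return R_AXES_VELO_TO_CAM @ np.asarray(point_velo, dtype=float)
```

Under this idealized mapping the x and z components match the hardcoded line quoted above, while the y component differs (`-z` here versus `z + 2.3` in the code).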

deemoo-wang, Apr 28 '22 07:04

Hello @deemoo-wang,

I think this question is best addressed to @sondisonda, who provided this transformation; I asked the same question myself.

So afaik the 2.3m is the distance from floor to the camera but I would also like to have an official specification for this.

ch-sa, Apr 30 '22 20:04

@ch-sa 3D bbox locations in KITTI labels are given in the camera frame (not the LiDAR frame, unfortunately for us), so we have to calculate the camera-to-lidar transformation manually for each point cloud + label pair (using the corresponding calib file).

My result with open3d (I can send code, if needed): [screenshot]

There are specific steps for each label:

  1. Get the R0_rect (3x3) and Tr_velo_to_cam (3x4) fields from the corresponding calib file and expand them to 4x4 homogeneous transformation matrices (e.g. with np.concatenate). Let's call them T_rect and T_v2c respectively.
  2. Calculate the lidar-to-camera transformation: T_l2c = T_rect @ T_v2c.
  3. Calculate the camera-to-lidar transformation: T_c2l = np.linalg.inv(T_l2c) (or a fancier inverse).
  4. Extract the 3D bbox location from the label (let's call it loc) and append a one to make it homogeneous: [x, y, z] -> [x, y, z, 1].
  5. Transform the 3D bbox location to the lidar frame via our T_c2l: loc = T_c2l @ loc.
  6. Repeat steps 1-5 for each label + calib file pair in the dataset.
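The steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the helper names (`parse_calib`, `to_homogeneous`, `camera_to_lidar`, `location_to_lidar`) are hypothetical, and the parser assumes the usual KITTI object calib layout of `key: v1 v2 ...` lines.

```python
import numpy as np


def parse_calib(text):
    """Parse 'key: v1 v2 ...' lines of a calib file into 1-D float arrays."""
    calib = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, values = line.split(":", 1)
        calib[key.strip()] = np.array([float(v) for v in values.split()])
    return calib


def to_homogeneous(mat):
    """Embed a 3x3 or 3x4 matrix into a 4x4 homogeneous transform (step 1)."""
    out = np.eye(4)
    rows, cols = mat.shape
    out[:rows, :cols] = mat
    return out


def camera_to_lidar(calib):
    """Build T_c2l from the R0_rect and Tr_velo_to_cam fields (steps 1-3)."""
    T_rect = to_homogeneous(calib["R0_rect"].reshape(3, 3))
    T_v2c = to_homogeneous(calib["Tr_velo_to_cam"].reshape(3, 4))
    T_l2c = T_rect @ T_v2c           # lidar -> rectified camera
    return np.linalg.inv(T_l2c)      # rectified camera -> lidar


def location_to_lidar(loc_cam, T_c2l):
    """Transform one [x, y, z] label location to the lidar frame (steps 4-5)."""
    loc_h = np.append(np.asarray(loc_cam, dtype=float), 1.0)
    return (T_c2l @ loc_h)[:3]
```

For a whole dataset, `camera_to_lidar` would be called once per calib file and `location_to_lidar` once per object in the matching label file (step 6).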

Note: the main idea is that the camera-to-lidar transformation can differ from cloud to cloud (each cloud + label + calib set has its own), so having one hardcoded transformation for the whole dataset (as it is now) is incorrect.

Divelix, Jan 26 '23 12:01

@Divelix thanks for the detailed clarification.

So basically we would need to support reading the calibration files for every point cloud, right?

Are you interested in creating a PR for this? I can support you with the integration once the base functions are there.

I probably won't find time to address this soon.

Best, Christoph

ch-sa, Jan 26 '23 20:01

@ch-sa I just created a PR with a solution, but I assumed that the calib folder is always next to the labels folder. It is probably better to add a folder selection to config.ini or even to the settings section of the GUI.

Divelix, Jan 28 '23 16:01

Great, will have a look at it tomorrow!

And I can then add the config option.

Do you have an example point cloud with calib and label file you could provide, so I can test?

ch-sa, Jan 28 '23 17:01

@ch-sa Yes, here are the first 11 samples from KITTI: kitti_10.zip. The last one (000010) is the one in my screenshot above.

Divelix, Jan 29 '23 09:01

@ch-sa this issue can be closed now, can't it?

Divelix, Feb 01 '23 07:02

Exactly, thanks!

ch-sa, Feb 01 '23 19:02