rgbd-pose3d
RGB to depth alignment
Hey there,
Thanks for your work!
I'm currently facing a problem aligning the depth map and the RGB frame from your dataset.
I'm following the idea of back-projecting the depth map into the depth camera frame, transforming it to world coordinates, and then projecting it back onto the color image plane. For the sake of testing, I've tried doing this without paying much attention to the depth values themselves, and using the API you've provided as much as possible.
So these are the steps I'm doing right now:
- Project from depth pixel space to the depth camera frame with my own function:
import numpy as np

def project_from_view(depth, camid, calib_data):
    """ project from pixel plane to the camera coordinates """
    depth_map = np.array(depth)
    intrinsics = calib_data[camid][0]
    # split the 3x3 intrinsic matrix into focal lengths and optical center
    focals, opt_center = np.eye(3), np.eye(3)
    focals[[0, 1], [0, 1]] = [intrinsics[0][0], intrinsics[1][1]]
    opt_center[[0, 1], [2, 2]] = [intrinsics[0][2], intrinsics[1][2]]
    # homogeneous coordinates for every pixel; np.where yields (row, col)
    # index arrays, and the -1 in the third row makes the opt_center matrix
    # subtract the optical center in the next step
    pts = np.vstack((np.where(depth_map != None),
                     np.full(depth_map.shape[0] * depth_map.shape[1], -1)))
    pts = np.linalg.inv(focals).dot(opt_center.dot(pts))
    pts[2, :] = 1
    pts = pts * depth_map.flatten()  # scale each ray by its measured depth
    return pts.T
- Project to the world coordinate frame with the function from your API:
trafo_cam2world(pts_cam_d, depth_cam_id, calib_frame)
- Project from world to the color image plane with the function from your API:
project_from_world_to_view(pts_world, color_cam_id, calib_frame)
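Putting the three steps above together, the pipeline looks roughly like this (a sketch; the variable names just tie the calls above together, and depth_cam_id, color_cam_id, calib_frame come from the dataset's calibration):

pts_cam_d = project_from_view(depth_map, depth_cam_id, calib_frame)           # pixels -> depth camera frame
pts_world = trafo_cam2world(pts_cam_d, depth_cam_id, calib_frame)             # depth camera -> world
pts_color = project_from_world_to_view(pts_world, color_cam_id, calib_frame)  # world -> color image plane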
After applying those steps I get the following map:
[image: reprojected depth map overlaid on the color frame]
The depth map at the bottom right is shifted, but I'm sure that I've used the correct intrinsics and extrinsics.
Moreover, I've tested my projection function separately and don't think the issue is there.
Is this normal behavior? It's hard to believe that the reprojection gives such a large displacement while the 3D points are perfectly mapped and aligned.
I have the same question.
Do you have an answer to this question now?
I also encountered this problem. After many tests, I found that the cause was a confusion of the x and y coordinate order. Images are generally read with shape (height, width, 3), so the corresponding pixel coordinates are (y, x), not (x, y).
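Concretely, in the project_from_view function from the question, np.where returns index arrays in (row, col) = (y, x) order, while the intrinsics expect homogeneous pixel coordinates ordered (x, y, 1). A minimal sketch of a corrected back-projection (the helper name and the depth > 0 validity check are my own; intrinsics here is the full 3x3 camera matrix rather than the split focals/opt_center form above):

import numpy as np

def project_from_view_fixed(depth_map, intrinsics):
    """ back-project a depth map with pixel coordinates in (x, y) order """
    # np.where yields (row, col) = (y, x) index arrays; reorder them to
    # (x, y, 1) homogeneous pixels before applying the inverse intrinsics
    ys, xs = np.where(depth_map > 0)             # keep only valid depth pixels
    pts = np.vstack((xs, ys, np.ones_like(xs)))  # homogeneous (x, y, 1)
    # X = (u - cx) / fx * z,  Y = (v - cy) / fy * z,  Z = z
    pts_cam = np.linalg.inv(intrinsics).dot(pts) * depth_map[ys, xs]
    return pts_cam.T

With the two index rows swapped like this, the reprojection should line up with the color frame.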
Your result looks quite good. I also want to test on the MKV dataset. Could you please tell me how you obtained it?