hope-dataset
Object pose of HOPE-Video
Hi @swtyree @Uio96 @sbirchfield,
Thanks for sharing the HOPE cad models and dataset!
My question: when I transform an object's pose back into the world frame from different views, the resulting world-frame poses are not the same, which suggests the poses contain some error. Is this error acceptable?
Thanks.
Hi @Jing-lun, thanks for spotting this. I think the issue is an error in camera extrinsics for HOPE-Video where the translation units appear to be in meters, while object poses are in cm. I'll confirm this and update the files later today.
Hi @swtyree, thanks for your prompt reply. Please let me know once the files are updated!
I tested again, and even after making the units consistent, the 3D poses still do not match.
I tested the pose of the Mac&Cheese model in the first and last views of scene_0000; my calculation is below.
import numpy as np

'''camera_extrinsic1 and pose1 are from hope_video/scene_0000/0000.json'''
camera_extrinsic1 = np.asarray([[
-0.9886373,
-0.14978693,
0.012654976,
79.977846
],[
-0.1278811,
0.7938205,
-0.59455484,
-32.258067
],[
0.07901077,
-0.5894174,
-0.8039555,
23.390512
],[
0.0,
0.0,
0.0,
1.0
]])
pose1 = np.asarray([[
-0.20787001630983457,
-0.9763291480689646,
0.059761308091495956,
0.13647988469971395
],[
-0.7948878577485222,
0.13300297015968482,
-0.5919996280969145,
-21.892505447000136
],[
0.5700380023040957,
-0.17056251735382824,
-0.8037195436940463,
55.94770750870245
],[
0.0,
0.0,
0.0,
1.0
]])
'''camera_extrinsic2 and pose2 are from hope_video/scene_0000/0364.json'''
camera_extrinsic2 = np.asarray([[
-0.8754454,
-0.45876563,
-0.15208338,
58.14929
],[
-0.2247508,
0.6649921,
-0.7122307,
-19.330604
],[
0.4278812,
-0.58933824,
-0.6852723,
6.31446
],[
0.0,
0.0,
0.0,
1.0
]])
pose2 = np.asarray([[
0.11922886974591602,
-0.9869201213152595,
-0.10850370274050081,
-7.430878036322042
],[
-0.709816306693385,
-0.008316097101108606,
-0.7043377604400789,
-12.936461380072812
],[
0.6942227429692126,
0.1609950503421291,
-0.7015235871501129,
62.88667563186634
],[
0.0,
0.0,
0.0,
1.0
]])
'''Tow = Toc*Tcw'''
world1 = pose1.dot(camera_extrinsic1)
world2 = pose2.dot(camera_extrinsic2)
Okay, thanks for the update. Have you confirmed that this issue is only with HOPE-Video and not HOPE-Image?
Well, all the objects in the HOPE-Image folder stay still, with no translation or rotation (I think the only difference in HOPE-Image is the lighting condition), so I cannot use the same method to check whether the object poses in the world frame agree.
I think I figured out the issues:
- As we already established, the translation in the camera extrinsic matrix was in m, while object poses are in cm.
- The extrinsic matrix is actually world-to-camera, rather than camera-to-world as you expected (and as I also expected until I dug into it). In the line preview.py#L112, the extrinsic matrix is used to transform the scene reconstruction point cloud from world coordinates to camera coordinates.
To project a pose from camera to world coordinates, use this for now:
extrinsics_w2c[:3,-1] *= 100 # correct translation units from m to cm
pose_world = np.linalg.inv(extrinsics_w2c) @ pose_camera
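As a sanity check of the two lines above, here is a minimal sketch that applies the same inverse to the matrices quoted earlier in this thread (scene_0000, frames 0000 and 0364). It assumes those extrinsic translations are already in cm, since they were pasted after the units were made consistent, so the ×100 scaling is skipped. The helper name `camera_to_world` and the variable names are illustrative only and are not part of the dataset tooling; object-pose values are rounded to 8 decimals.

```python
import numpy as np

# World-to-camera extrinsics and camera-frame object poses, copied from the
# matrices quoted earlier in this thread (object poses rounded to 8 decimals).
# Assumption: the translations here are already in cm, so no x100 scaling.
extrinsics_w2c_1 = np.array([
    [-0.9886373, -0.14978693, 0.012654976, 79.977846],
    [-0.1278811, 0.7938205, -0.59455484, -32.258067],
    [0.07901077, -0.5894174, -0.8039555, 23.390512],
    [0.0, 0.0, 0.0, 1.0],
])
pose_camera_1 = np.array([
    [-0.20787002, -0.97632915, 0.05976131, 0.13647988],
    [-0.79488786, 0.13300297, -0.59199963, -21.89250545],
    [0.57003800, -0.17056252, -0.80371954, 55.94770751],
    [0.0, 0.0, 0.0, 1.0],
])
extrinsics_w2c_2 = np.array([
    [-0.8754454, -0.45876563, -0.15208338, 58.14929],
    [-0.2247508, 0.6649921, -0.7122307, -19.330604],
    [0.4278812, -0.58933824, -0.6852723, 6.31446],
    [0.0, 0.0, 0.0, 1.0],
])
pose_camera_2 = np.array([
    [0.11922887, -0.98692012, -0.10850370, -7.43087804],
    [-0.70981631, -0.00831610, -0.70433776, -12.93646138],
    [0.69422274, 0.16099505, -0.70152359, 62.88667563],
    [0.0, 0.0, 0.0, 1.0],
])


def camera_to_world(extrinsics_w2c, pose_camera):
    """Map a camera-frame object pose to the world frame.

    The extrinsic matrix is world-to-camera, so its inverse
    (camera-to-world) is applied on the left of the object pose.
    """
    return np.linalg.inv(extrinsics_w2c) @ pose_camera


world1 = camera_to_world(extrinsics_w2c_1, pose_camera_1)
world2 = camera_to_world(extrinsics_w2c_2, pose_camera_2)

# A static object should have the same world pose in both frames.
print("world translation, frame 0000:", world1[:3, 3])
print("world translation, frame 0364:", world2[:3, 3])
print("max abs difference:", np.abs(world1 - world2).max())
```

With these inputs the two world-frame translations should agree at about (80.2, 1.0, -33.3) cm, whereas the original `pose.dot(camera_extrinsic)` product treats the extrinsic as camera-to-world and multiplies in the wrong order.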
I'll update the documentation in the README, and I may upload a new version with more explicit key names in the json files. But I'll need to do that at a later time.
Thanks again for reaching out with the issue!
Thanks a lot @swtyree! Now the poses are matched!
Thanks! I'm going to reopen the issue until I can get a new version of the annotations uploaded to Google Drive.