
About object scaling

wangguandongzheng opened this issue · 5 comments

Sorry, I still have a question.

Why is it that for kinect_driller_seq and mustard0 in the demo, applying mesh.apply_scale(scale_ratio) (e.g., scale_ratio=1.5) makes both estimation and tracking completely wrong? I used the sample depth maps you provided.

Does this mean that the mesh's scale must be consistent with the scale of the object in the RGB image? If I use a novel object and I am sure my depth map is in mm, must the reconstructed .obj be at a specific scale?

In addition, I noticed that your code crops the input RGB based on the object's diameter. I think this is wrong: the diagonal length of the mask's bounding box should be used for the RGB crop instead.

wangguandongzheng avatar Aug 05 '24 03:08 wangguandongzheng

> Does this mean that the mesh's scale must be consistent with the scale of the object in the RGB image? If I use a novel object and I am sure my depth map is in mm, must the reconstructed .obj be at a specific scale?

Yes, the scale is metric (in meters). The demo data needs no additional scaling because it is already in meters. Where did you get a depth map that does not have the correct scale? Are you using monocular RGB?
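As a side note, many RGB-D drivers deliver depth as 16-bit integers in millimeters, so a metric pipeline needs a mm-to-meters conversion first. A minimal sketch (the array values and clipping thresholds here are hypothetical, not from the FoundationPose code):

```python
import numpy as np

# Hypothetical raw depth frame in millimeters (uint16), as many RGB-D
# drivers (e.g. RealSense) deliver it.
depth_mm = np.array([[500, 1000],
                     [0,   65535]], dtype=np.uint16)

# Convert to meters (float32), which is what metric pipelines expect.
depth_m = depth_mm.astype(np.float32) / 1000.0

# Invalidate implausible readings (0 = no return; far values = noise).
depth_m[(depth_m <= 0.001) | (depth_m > 10.0)] = 0.0
```

After this, `depth_m[0, 0]` is 0.5 m and the zero / out-of-range pixels stay invalid.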

The cropping is correct. Basing it on the object diameter makes it robust to noisy segmentation/bounding boxes.
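To illustrate why a diameter-based crop depends on a correctly scaled mesh: under a pinhole camera, the projected size of the object scales with its metric diameter, so a 1.5x mesh scale directly inflates the crop. A hedged sketch (the function name, padding factor, and numbers are illustrative, not the actual FoundationPose implementation):

```python
# Sketch: crop half-size in pixels for an object of metric diameter
# at depth z under a pinhole camera with focal length fx (pixels).
def crop_radius_px(diameter_m, z_m, fx, pad=1.4):
    # Projected half-diameter in pixels, padded so the crop survives
    # pose error and noisy masks/bounding boxes.
    return 0.5 * diameter_m * fx / z_m * pad

# A 0.2 m object at 1 m with fx = 600 px -> 84 px crop radius.
r = crop_radius_px(diameter_m=0.2, z_m=1.0, fx=600.0)
```

Because the crop is derived from the mesh diameter rather than the observed mask, a mismatched mesh scale shifts the crop, which is consistent with estimation failing after `apply_scale(1.5)`.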

wenbowen123 avatar Aug 07 '24 04:08 wenbowen123

Thank you for your reply. I am using a RealSense D435i. What I meant is that I only scaled the mesh by 1.5x while everything stayed metric (in meters), and I used the depth map you provided. My point is that the .obj must be at one specific scale, regardless of whether the units are meters or centimeters: even a 1.5x difference breaks it.

wangguandongzheng avatar Aug 08 '24 02:08 wangguandongzheng

I use a novel object mesh (.obj file, diameter = 400 mm in NX (UG); the actual diameter is also 400 mm), and my depth map (captured by an RGB-D camera) is in mm. I have to scale the mesh by 0.02 to get a roughly correct pose estimate, and the pose is still very coarse.

affection123456 avatar Aug 09 '24 02:08 affection123456

> I use a novel object mesh (.obj file, diameter = 400 mm in NX (UG); the actual diameter is also 400 mm), and my depth map (captured by an RGB-D camera) is in mm. I have to scale the mesh by 0.02 to get a roughly correct pose estimate, and the pose is still very coarse.

When running, you should use the scale in meters.
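Concretely, a CAD mesh authored in millimeters should be multiplied by 0.001 before being passed to the estimator (with trimesh, that is `mesh.apply_scale(0.001)`). A minimal sketch using a hypothetical two-vertex stand-in for the 400 mm object:

```python
import numpy as np

# Hypothetical vertex array of a mesh authored in millimeters
# (a 400 mm-diameter object, as in the comment above).
vertices_mm = np.array([[-200.0, 0.0, 0.0],
                        [ 200.0, 0.0, 0.0]])

# Convert to meters before pose estimation; with a trimesh.Trimesh
# object this is equivalent to mesh.apply_scale(0.001).
vertices_m = vertices_mm * 0.001

# The diameter is now 0.4 m, matching the metric depth map.
diameter_m = np.linalg.norm(vertices_m[1] - vertices_m[0])
```

The depth map must be converted to meters as well; otherwise only an arbitrary fudge factor (like the 0.02 above) partially compensates for the unit mismatch.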

wenbowen123 avatar Aug 15 '24 06:08 wenbowen123

Thanks! I got the right pose.

affection123456 avatar Aug 19 '24 07:08 affection123456

@ZhengWangguandong Hello! What do you mean by "the obj scale of the mesh must be consistent with the scale of the object in RGB"? In run_demo.py, I printed the scale (bbox) of the given CAD mesh and of "scene_complete.ply" (under the `if debug>=3` branch, reconstructed from the depth, mask, and K), but their scales are very different. I wonder why; I think they should be the same. Looking forward to your reply. Thanks!

1270645409 avatar Apr 15 '25 03:04 1270645409

> @ZhengWangguandong Hello! What do you mean by "the obj scale of the mesh must be consistent with the scale of the object in RGB"? In run_demo.py, I printed the scale (bbox) of the given CAD mesh and of "scene_complete.ply" (under the `if debug>=3` branch, reconstructed from the depth, mask, and K), but their scales are very different. I wonder why; I think they should be the same. Looking forward to your reply. Thanks!

Yes, their scales should be the same. I have now solved the problem; see https://github.com/NVlabs/FoundationPose/issues/329#issuecomment-2817041456

1270645409 avatar Apr 20 '25 07:04 1270645409