on VGGT with RGB-D depth maps (apparent translation in pose)
Dear Authors,
Thank you for the excellent work and the great presentation at CVPR. I’m evaluating VGGT for camera pose estimation in a low-overlap setting. With RGB images from my RGB-D sensor, the recovered poses look correct.
RGB result (looks good, see screenshot):
However, when I use the aligned depth maps from the same sensor (registered to the color camera), I observe an apparent translation/shift in the reconstructions:
Depth-based result (shows the translation effect I'm talking about, see screenshot):
Do you have any recommendations on what might cause this and how to address it? For context, the depth camera is factory-registered to the color camera. I use the color intrinsics for both depth and color, and the extrinsics between the aligned depth and color are identity.
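To make the setup concrete, here is a minimal sketch of how I interpret the aligned depth: each pixel is back-projected with the standard pinhole model using the color intrinsics, and because the depth-to-color extrinsics are identity, the resulting point is taken to be in the color camera frame already. The function name and the intrinsic values are placeholders for illustration, not my sensor's actual calibration and not part of VGGT.

```python
def backproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with metric depth into the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Placeholder color intrinsics (assumed values, not a real calibration).
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# Identity depth-to-color extrinsics: the back-projected point is used
# directly as a point in the color camera frame, with no further transform.
point_in_color_frame = backproject_pixel(320.0, 240.0, 1.5, FX, FY, CX, CY)
```

With these placeholder values, a pixel at the principal point lands on the optical axis at its measured depth, which is the behavior I expect from registered depth.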
I’d be grateful for any pointers. Thanks again for the great work!