co3d
Scale in co3d annotation file
Hi,
I'm trying to run NeRF on the scenes from your dataset. I started by looking at one frame in a scene from the "hydrant" category. Using the FrameData, I extracted the R, T matrices and converted them to OpenCV format using https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/utils/camera_conversions.py#L65
So, this should give me the camera position for this specific frame. Looking at the extracted location of the camera (the T vector), I'm seeing large numbers that don't match the image (for example 7m, even though the hydrant appears very close to the camera in the image). Also, the depth map of the frame and the point cloud of the scene show large numbers that can't be real.
I assume there's some kind of scaling needed. How can I extract this scaling factor?
Thanks
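Unfortunately, we only crowd-sourced RGB videos, with no depth signal and no knowledge of the sensor calibration, so we cannot provide metric depth. The coordinate system was normalised for each scene separately (I think so that the SfM-reconstructed point cloud has STD=1). The camera translations, depth maps, and point clouds are therefore in arbitrary per-scene units, not metres, and the annotations contain no factor that converts them to metric scale.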
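If approximate metric units are needed anyway, one option is to measure a real-world length yourself and rescale the scene by hand. Below is a minimal NumPy sketch of that idea, not something the dataset provides: the helper names (`camera_center`, `scale_from_known_length`, `rescale_scene`) and the reference length are hypothetical, and it assumes the OpenCV convention `x_cam = R @ x_world + tvec`. Under that convention the camera position in world coordinates is `-R.T @ tvec` rather than `tvec` itself.

```python
import numpy as np


def camera_center(R, tvec):
    """Camera position in world coordinates.

    Assumes the OpenCV convention x_cam = R @ x_world + tvec,
    so the centre C satisfies R @ C + tvec = 0.
    """
    return -R.T @ tvec


def scale_from_known_length(p_a, p_b, true_length_m):
    """Scale factor mapping scene units to metres.

    p_a, p_b: two 3D points picked from the scene point cloud.
    true_length_m: the real distance between them, measured by you
    (an assumption; the dataset does not contain this value).
    """
    return true_length_m / np.linalg.norm(np.asarray(p_a) - np.asarray(p_b))


def rescale_scene(points, tvecs, depth, scale):
    """Apply a uniform scale to a scene.

    Points, camera translations and depth values all scale by the same
    factor; rotations and intrinsics are unchanged.
    """
    return points * scale, tvecs * scale, depth * scale
```

For example, if two points on the hydrant are 3.0 scene units apart and the real distance is 0.6 m, the factor is 0.2, and a camera that was 7 units away ends up 1.4 m away. The result is only as accurate as the reference measurement.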