
Problem using ORB-SLAM2 results instead of COLMAP for the data processing step

Open AruniRC opened this issue 3 months ago • 0 comments

Describe the bug Using ORB-SLAM2 results instead of COLMAP for a sequence from the TUM dataset (freiburg3_long_office_household) produces incorrect camera poses when loaded into Nerfstudio: the trajectory looks correct in Viser and closely matches the ground truth, but the cameras all point the wrong way. As a result, Nerfacto training on this data does not converge to a reasonable solution.

I have converted the ORB-SLAM2 outputs into the same format that Nerfstudio produces from COLMAP data, including the coordinate conversion recorded in the applied_transform matrix in the transforms.json file. The per-frame poses come from ORB-SLAM2 and follow the X-right, Y-down, Z-forward convention, which is the same as OpenCV.
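One thing worth checking for poses in this convention: Nerfstudio documents its camera frames as OpenGL-style (X-right, Y-up, Z-back, with the camera looking along -Z), so an OpenCV-style camera-to-world matrix needs its camera Y and Z axes flipped before it is written to transforms.json. The sketch below is a minimal illustration of that flip, assuming each pose is a 4x4 camera-to-world matrix stored as nested lists; the function names are hypothetical and not part of Nerfstudio or ORB-SLAM2.

```python
def flip_camera_axes(c2w):
    """Re-express an OpenCV-style camera-to-world pose in the OpenGL camera convention.

    OpenCV camera: x right, y down, z forward.
    OpenGL camera: x right, y up, z back.
    Negating the Y and Z columns of the rotation flips those two camera axes.
    The camera position (last column) is left unchanged.
    """
    out = [row[:] for row in c2w]  # copy so the input is not modified
    for r in range(3):
        out[r][1] = -out[r][1]
        out[r][2] = -out[r][2]
    return out


def view_direction_opengl(c2w):
    """World-space viewing direction of an OpenGL-style pose (the camera's -Z axis)."""
    return [-c2w[r][2] for r in range(3)]


# An OpenCV camera at the origin with identity rotation looks along world +Z.
identity_pose = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
gl_pose = flip_camera_axes(identity_pose)
```

After the flip, `view_direction_opengl(gl_pose)` still points along world +Z, so the physical viewing direction is preserved. If this flip is skipped, the positions stay the same (so the trajectory looks right) while every camera is interpreted as facing the opposite direction.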

To Reproduce Steps to reproduce the behavior:

  1. Please download the zipped file containing the formatted SLAM results (Google drive link - 69.5 MB)
  2. Unzip to some local folder. The following should be present in that folder: transforms.json, sparse_pc.ply, images/*.png
  3. Run training on this folder: ns-train nerfacto --data /path/to/your/unzipped/folder
  4. Open Viser in your browser and check the camera poses.
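Before training, the poses in transforms.json can be sanity-checked offline. The sketch below assumes the usual Nerfstudio layout, where a top-level "frames" list holds entries with "file_path" and a 4x4 "transform_matrix"; the helper names are hypothetical. It flags any frame whose rotation block does not have determinant +1, which would indicate that an odd number of axes was flipped during conversion.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def check_frames(meta, tol=1e-4):
    """Return a list of (file_path, problem) pairs for malformed frame poses.

    `meta` is the parsed content of transforms.json (a dict).
    """
    problems = []
    for frame in meta["frames"]:
        name = frame.get("file_path", "<unknown>")
        m = frame["transform_matrix"]
        if len(m) != 4 or any(len(row) != 4 for row in m):
            problems.append((name, "transform_matrix is not 4x4"))
            continue
        rotation = [row[:3] for row in m[:3]]
        if abs(det3(rotation) - 1.0) > tol:
            problems.append((name, "rotation determinant is not +1"))
    return problems
```

For example, after `import json`, calling `check_frames(json.load(open("/path/to/your/unzipped/folder/transforms.json")))` returns an empty list when every pose passes. A clean result does not prove the cameras face the right way, since flipping two axes keeps the determinant at +1; it only rules out reflections and malformed matrices.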

Expected behavior I have verified that using COLMAP gives a good-quality trained NeRF on this same data. The incorrect poses only appear when I replace COLMAP with ORB-SLAM, and from the Nerfstudio docs I cannot tell what other conversions are needed.

Screenshots

From ORB-SLAM (trajectory correct, cameras pointing incorrectly): image

From COLMAP (trajectory similar, cameras pointing correctly - inwards): image

Please note that although COLMAP processing of the original TUM dataset sequence yields more frames than ORB-SLAM, the issue here is the camera poses.

Additional context The applied transform to map from OpenCV/COLMAP/ORB-SLAM convention to Nerfstudio/OpenGL:

 "applied_transform": [
        [
            1.0,
            0.0,
            0.0,
            0.0
        ],
        [
            0.0,
            0.0,
            1.0,
            0.0
        ],
        [
            -0.0,
            -1.0,
            -0.0,
            -0.0
        ]
    ]
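To make explicit what this matrix does: it is a permutation of the world axes that maps (x, y, z) to (x, z, -y), and it acts on each camera-to-world pose by left-multiplication. It changes the world frame only and does not change the camera's own axis convention. The sketch below is a minimal illustration with hypothetical helper names, using plain nested lists.

```python
# The 3x4 applied_transform from transforms.json (signs of zero dropped).
APPLIED_TRANSFORM = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
]


def matmul_3x4_4x4(a, b):
    """Multiply a 3x4 matrix by a 4x4 matrix, returning a 3x4 result."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(3)
    ]


def transform_pose(c2w):
    """Left-multiply a 4x4 camera-to-world pose by the applied transform.

    The bottom row [0, 0, 0, 1] is re-appended so the result is 4x4 again.
    """
    return matmul_3x4_4x4(APPLIED_TRANSFORM, c2w) + [[0.0, 0.0, 0.0, 1.0]]


# A camera with identity rotation located at world position (1, 2, 3).
example_pose = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
moved = transform_pose(example_pose)
```

In this example the camera position (1, 2, 3) becomes (1, 3, -2), and the rotation columns are permuted the same way. Because the same world transform is applied to every pose, the relative camera layout is unchanged.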

Any help or suggestions (perhaps more information in the docs for Nerfstudio users attempting such conversions?) would be super helpful here!

Thanks!

AruniRC, Apr 22 '24 17:04