nerfstudio icon indicating copy to clipboard operation
nerfstudio copied to clipboard

Why is the world coord system rotated in colmap_utils.py?

Open jkulhanek opened this issue 1 year ago • 23 comments

What is the reason the world coord system is rotated here? https://github.com/nerfstudio-project/nerfstudio/blob/079f419e544b8d2aa6aad6f1a31decf0e06cb88c/nerfstudio/process_data/colmap_utils.py#L620-L621

Could we keep the coordinates as they are in COLMAP?

jkulhanek avatar Feb 25 '23 19:02 jkulhanek

COLMAP uses OpenCV camera conventions, but nerfstudio (and many other nerf impls, maybe all of them?) use OpenGL coordinate system. The code here is a bit obfuscated but it provides a precise translation.

FMI see discussions:

  • https://github.com/google-research/multinerf#making-your-own-loader-by-implementing-_load_renderings
  • https://github.com/NVlabs/instant-ngp/issues/72

pwais avatar Feb 25 '23 23:02 pwais

More info here - https://docs.nerf.studio/en/latest/quickstart/data_conventions.html#coordinate-conventions

tancik avatar Feb 25 '23 23:02 tancik

The two lines don’t change the camera coordinate system though, there is a line above the two which flips z,y to the from opencv to opengl. I thought that in colmap the orientation of world coordinate system is more or less arbitrary. Even when you run something like model_aligner the orientation is not defined. The orientation is overridden in the dataparser anyway, so it doesn’t really matter. I just thought it would be good to have a way to preserve the poses from colmap (when disabling the transformation code in the dataparser).

to the best of my knowledge, in multinerf (first referenced link) they don’t do it. They just switch from opencv to opengl here https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/datasets.py#L109

jkulhanek avatar Feb 26 '23 06:02 jkulhanek

you're right, the cited code in nerfstudio is not equivalent to the cited multinerf code.. at least not the precise cited lines. The cited conversion appears to do a handedness change but then also flips z:

array([[ 0, 1,  0,  0],
       [ 1, 0,  0,  0],
       [ 0, 0, -1,  0],
       [ 0, 0,  0,  1]])

(note I'm going to ignore re-scale / re-center)

So, COLMAP is right-handed, OpenGL camera model is right-handed, but OpenGL NDC is left-handed, no? np.diag([1, -1, -1, 1]) should account for the change in camera frames, but the nerfstudio code is doing a handedness too? this is very confusing because the viewer world frame indicator is definitely right-handed, at least assuming the RGB axes all indicate positive xyz

I believe the original LLFF code used in original nerf does a handedness change as well:

  • https://github.com/Fyusion/LLFF/blob/c6e27b1ee59cb18f054ccb0f87a90214dbe70482/llff/poses/pose_utils.py#L44
  • https://github.com/Fyusion/LLFF/blob/c6e27b1ee59cb18f054ccb0f87a90214dbe70482/llff/poses/pose_utils.py#L50

Note that the original nerf code also had a special "NDC" mode for forward-facing scenes, that's a little different than what we're discussing here I think:

  • https://github.com/bmild/nerf/blob/18b8aebda6700ed659cb27a0c348b737a5f6ab60/run_nerf.py#L310

And instant-ngp has it too:

  • https://github.com/NVlabs/instant-ngp/blob/9129d680bea305d457214c3e4ed45bca8ded6b86/scripts/colmap2nerf.py#L330

SDFStudio in contrast flips the camera frame back to OpenCV:

  • https://github.com/autonomousvision/sdfstudio/blob/1a4c961528be33b2affd9836614f45a487392061/scripts/datasets/process_nerfstudio_to_sdfstudio.py#L47

One way to break all this down is into three frames: world, camera (logical), and image plane ("sensor frame" or "screen frame"). If expected ray termination is supposed to be positive, and the screen / "sensor" frame has +x right and +y up, then that's a left-handed system, while the world and camera frames are right-handed.

So I think what we're seeing is that the Nerf model code is "leaking the abstraction of the screen space" into the data encoding? In multinerf it seems the screen space stuff is absorbed into the ray casting step: https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/camera_utils.py#L646

Now I'm curious if the camera paths that can be specified in the viewer for rendering... are those poses in a right-handed or left-handed system? ha

pwais avatar Feb 28 '23 10:02 pwais

The cited code does not change the handedness. It only rotates the world coordinate system, right?

jkulhanek avatar Feb 28 '23 10:02 jkulhanek

but in opengl camera, -z is forward / look at. yet cast rays have positive lengths. So isn’t that transform a handedness change, then a flip of z?

if you delete the code, can you get it to train?

On Tue, Feb 28, 2023 at 02:22 Jonáš Kulhánek @.***> wrote:

The cited code does not change the handedness. It only rotates the world coordinate system, right? I believe we can safely remove it without any issues?

— Reply to this email directly, view it on GitHub https://github.com/nerfstudio-project/nerfstudio/issues/1504#issuecomment-1447925156, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABVIIPNTMK3RYNMTYTZIY3WZXGXFANCNFSM6AAAAAAVIAJRME . You are receiving this because you commented.Message ID: @.***>

pwais avatar Feb 28 '23 10:02 pwais

If I delete the code, it trains fine. From what I understand, the only problem would be with forward-facing scenes?

jkulhanek avatar Feb 28 '23 10:02 jkulhanek

@pwais, hi! I am sorry for the stupid question, but could you explain again:

  1. Why do we need this conversion after the OpenCV -> OpenGL coordinate system conversion?
  2. What problem with forward-facing scenes does it solve, and why?

ShkarupaDC avatar Mar 26 '23 18:03 ShkarupaDC

I don't think there is a handedness change in these two lines, the first line changes the handedness and the next line changes it back. You can also verify this by calculating the determinant of the matrix array([[ 0, 1, 0, 0], [ 1, 0, 0, 0], [ 0, 0, -1, 0], [ 0, 0, 0, 1]]) (det=1 for sure) So it is merely a rotation of world frame, which I think has no influence on training.

qq456cvb avatar Jun 15 '23 17:06 qq456cvb

I've faced the same problem since I wanted to visualize the camera positions along with colmap point-cloud.

The difference between COLMAP and Nerfstudio format is just on the sign.

Nerfstudio: +X is right, +Y is up, and +Z is pointing back and away from the camera. -Z is the look-at direction.

Colmap: The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (ref: colmap-doc)

I'm using this code instead

c2w = np.linalg.inv(w2c)
c2w[0:3, 1:3] *= -1
# that is it!

panchagil avatar Oct 11 '23 15:10 panchagil

@jkulhanek If you want to keep the world coordinate, this is the option

--assume_colmap_world_coordinate_convention=False \
--orientation_method=none \
--center_method=none \
--auto-scale-poses=False \

which expects this behavior in colmap parser.

jb-ye avatar Jan 19 '24 18:01 jb-ye

as i originally instigated this issue, i vote that https://github.com/nerfstudio-project/nerfstudio/pull/2793 effectively closes this one. @jkulhanek any thoughts?

pwais avatar Mar 11 '24 14:03 pwais

Actually I think @jb-ye https://github.com/nerfstudio-project/nerfstudio/pull/2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163

BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388

That function above will only get the auto-orient-and-scale transform I think, eh?

pwais avatar Mar 11 '24 15:03 pwais

Actually I think @jb-ye #2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163

BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388

That function above will only get the auto-orient-and-scale transform I think, eh?

This is the intended behavior. When the flag is turned on, an additional transform will be applied to world coordinates. Camera poses will be converted to OpenGL convention regardless of this flag.

jb-ye avatar Mar 11 '24 16:03 jb-ye

@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of https://github.com/nerfstudio-project/nerfstudio/pull/2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

pwais avatar Mar 11 '24 22:03 pwais

@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of #2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

just to clarify, the additional transform enabled by assume_colmap_world_coordinate_convention is always there from the very beginning. My PR just wrap it around a flag so we can optional turn it off.

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file. By default, nerfstudio will first apply a fixed transform to colmap world before re-orienting it using orientation method. The only way to disable this behavior is set both option to false/none.

The 3d point clouds are transformed in the same way as we did for cameras regardless whether those options are on/off. If you believe they are inconsistent, could you tell a bit more details about why you think so?

jb-ye avatar Mar 12 '24 15:03 jb-ye

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file

Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

pwais avatar Mar 14 '24 22:03 pwais

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file

Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

unfortunately the splatfacto model can not be exported to a different world frame because one can not rotate the spherical harmonics parameters. See https://github.com/nerfstudio-project/nerfstudio/pull/2951 for similar discussion.

In other words, if one wants to recover the model in original world coordinate, one has to choose between (1) disable any nerfstudio specific transforms during loading data or (2) export a metadata file along side ply exports that recording those transforms.

jb-ye avatar Mar 15 '24 02:03 jb-ye

@jb-ye The current issue with frames etc (and especially the really poor naming e.g. "transformation_matrix" that hides the actual frame name) has lead to a lot of confusion on Discord and I've seen is a top reason why people just use the original Inria impl. Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer, really it should be the viewer that has an option to find "logical gravity vector" and then nobody has to worry about e.g. spherical harmonics being "in the wrong frame."

pwais avatar Mar 15 '24 06:03 pwais

Spherical harmonics can easily be rotated using Winger matrices. I have a np code somewhere if you want it

jkulhanek avatar Mar 15 '24 09:03 jkulhanek

PlayCavas's implementation seems working great:

  • https://github.com/playcanvas/engine/blob/release-1.69/src/scene/gsplat/gsplat.js#L259

pfxuan avatar Mar 15 '24 13:03 pfxuan

@pwais Thanks for the explanation. Just want to summarize your proposal and confirm my understanding:

(1) You want to remove assume_colmap_world_coordinate_convention flag and somehow put this transform in viewer. As such, when a user specify orientation_method=none, the exported GS is truly the original input world coordinates. The current implementation requires the user to additionally specify assume_colmap_world_coordinate_convention=False when parsing colmap data. (2) The default setting of nerfstudio is orient, center, and scale the scene (The current splatfacto requires the input scene to be scaled to 1 in order to work properly). As such, when exporting GS ply in the original world coordinate, one has to figure out a way to rotate spherical harmonics (e.g. one suggested by @jkulhanek ) to respect the original world coordinate.

In my opinion, if we implemented feature (2), (1) becomes irrelevant anyway. We may prioritize https://github.com/nerfstudio-project/nerfstudio/pull/2951 .

jb-ye avatar Mar 15 '24 18:03 jb-ye

Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer

Not entirely, nerfstudio doc suggested that the world coordinate's Up direction is +Z, while it is often not the case for a colmap project (where Up direction is -Y). But I agree this is really a convenience hack for ns-viewer. We can completely remove the assume_colmap_world_coordinate_convention and have options in ns-viewer to select gravity direction.

jb-ye avatar Mar 15 '24 18:03 jb-ye