
Add camera pose and projection gradient flow

Open jh-surh opened this issue 5 months ago • 13 comments

  • Added gradient backward flow to project_gaussians_backward_kernel for viewmat and projmat
  • Added a test script examples/test_pose_grad.py to test pose gradient update. The script interface is the same as simple_trainer.py:
python examples/test_pose_grad.py
python examples/test_pose_grad.py --img_path path/to/image
  • Validated on nerfstudio Splatfacto in this PR: https://github.com/nerfstudio-project/nerfstudio/pull/2885

jh-surh avatar Feb 08 '24 00:02 jh-surh

python examples/test_pose_grad.py --img_path path/to/image needs pytorch3D. Maybe we add the requirement to setup.py?

ichsan2895 avatar Feb 08 '24 14:02 ichsan2895

@ichsan2895, Good idea, but I think it may be better to add it to examples/requirements.txt

jh-surh avatar Feb 08 '24 15:02 jh-surh

It would be good to compare these camera grads against the camera grads from the torch implementation with autograd. This PR adds a torch implementation, which should have correct (although slow) gradients since they're computed by torch.
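As a rough illustration of this kind of check, here is a minimal sketch that compares a hand-derived gradient against a numerical reference. It does not use gsplat or torch: the toy `project` function, the fixed identity rotation, and the central finite differences standing in for autograd are all assumptions made for the example.

```python
# Gradient check for a toy pinhole projection (not gsplat's kernel).
# A world point p is shifted into the camera frame by a translation t
# (rotation is fixed to identity to keep the sketch short), then projected.

def project(p, t, fx, fy, cx, cy):
    """Project world point p with camera translation t; returns pixel (u, v)."""
    x, y, z = p[0] + t[0], p[1] + t[1], p[2] + t[2]
    return (fx * x / z + cx, fy * y / z + cy)

def loss(p, t, target, intr):
    """Squared pixel error between the projection and a target pixel."""
    u, v = project(p, t, *intr)
    return (u - target[0]) ** 2 + (v - target[1]) ** 2

def analytic_grad_t(p, t, target, intr):
    """Hand-derived d(loss)/dt, playing the role of a backward kernel."""
    fx, fy, cx, cy = intr
    x, y, z = p[0] + t[0], p[1] + t[1], p[2] + t[2]
    u, v = fx * x / z + cx, fy * y / z + cy
    du, dv = 2.0 * (u - target[0]), 2.0 * (v - target[1])
    return [
        du * fx / z,
        dv * fy / z,
        -(du * fx * x + dv * fy * y) / (z * z),
    ]

def numeric_grad_t(p, t, target, intr, eps=1e-6):
    """Central finite differences, standing in for an autograd reference."""
    grad = []
    for i in range(3):
        hi = list(t)
        lo = list(t)
        hi[i] += eps
        lo[i] -= eps
        grad.append((loss(p, hi, target, intr) - loss(p, lo, target, intr)) / (2 * eps))
    return grad

point = [0.3, -0.2, 2.0]
trans = [0.1, 0.05, 0.5]
target_px = [330.0, 240.0]
intrinsics = (500.0, 500.0, 320.0, 240.0)  # fx, fy, cx, cy

g_analytic = analytic_grad_t(point, trans, target_px, intrinsics)
g_numeric = numeric_grad_t(point, trans, target_px, intrinsics)
max_abs_diff = max(abs(a - b) for a, b in zip(g_analytic, g_numeric))
```

A small `max_abs_diff` relative to the gradient magnitude suggests the hand-written backward pass agrees with the reference; a large one points to a bug in one of the two.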

kerrj avatar Feb 08 '24 19:02 kerrj

Also this would be simpler without the redundant projmat (see https://github.com/nerfstudio-project/gsplat/pull/97). For camera calibration optimization, you would very likely want to optimize fx, fy, cx, cy (and possibly other distortion parameters if support for them is added) and not the redundant OpenGL/NDC-style projection matrix.
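To make the intrinsics case concrete, here is a minimal sketch of the derivatives of an ideal pinhole projection with respect to fx, fy, cx, cy. The function names are hypothetical and the code is independent of gsplat; it only illustrates what gradients w.r.t. the intrinsics look like for a single camera-frame point.

```python
def project(x, y, z, fx, fy, cx, cy):
    """Ideal pinhole projection of a camera-frame point to pixel (u, v)."""
    return fx * x / z + cx, fy * y / z + cy

def projection_jacobian_intrinsics(x, y, z):
    """Jacobian of (u, v) w.r.t. (fx, fy, cx, cy).

    Rows are (u, v); columns are (fx, fy, cx, cy).
    """
    return [
        [x / z, 0.0, 1.0, 0.0],
        [0.0, y / z, 0.0, 1.0],
    ]

jac = projection_jacobian_intrinsics(0.4, -0.15, 2.5)
```

Because the projection is linear in the four intrinsics, these derivatives are exact and do not depend on the current values of fx, fy, cx, cy.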

oseiskar avatar Feb 09 '24 14:02 oseiskar

There's also a simpler way of achieving the same goal of optimizing the view matrix, shown here: https://github.com/nerfstudio-project/gsplat/pull/127

oseiskar avatar Feb 09 '24 17:02 oseiskar

> Also this would be simpler without the redundant projmat (see #97). For camera calibration optimization, you would very likely want to optimize fx, fy, cx, cy (and possibly other distortion parameters if support for them is added) and not the redundant OpenGL/NDC-style projection matrix.

I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj

jh-surh avatar Feb 14 '24 15:02 jh-surh

> I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj

@jh-surh I elaborated on this here: https://github.com/nerfstudio-project/gsplat/pull/97#issuecomment-1951381539. Everything that can be implemented with projection matrices can also be implemented by extending the "intrinsics" to support camera models other than the ideal pinhole (fx,fy,cx,cy). It's also relatively simple to convert any typical (non-ortho) OpenGL projection matrix to (fx,fy,cx,cy) format (and throw away the near & far clip terms, which do not matter in this context).
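As an illustration of that conversion, here is a minimal sketch under an explicitly assumed convention: the clip-space w equals +z in the camera frame, and NDC x, y in [-1, 1] map linearly to pixels in [0, width] x [0, height]. Other OpenGL-style conventions (camera looking along -z, flipped y) would need sign changes. The function names and default near/far values are hypothetical.

```python
def intrinsics_to_projmat(fx, fy, cx, cy, width, height, near=0.01, far=100.0):
    """Build a 4x4 perspective matrix (row-major nested lists).

    Assumed convention: clip w = +z_cam, and NDC x, y in [-1, 1] map to
    pixel coordinates u = (x_ndc + 1) * width / 2, v = (y_ndc + 1) * height / 2.
    """
    return [
        [2.0 * fx / width, 0.0, 2.0 * cx / width - 1.0, 0.0],
        [0.0, 2.0 * fy / height, 2.0 * cy / height - 1.0, 0.0],
        [0.0, 0.0, (far + near) / (far - near), -2.0 * far * near / (far - near)],
        [0.0, 0.0, 1.0, 0.0],
    ]

def projmat_to_intrinsics(P, width, height):
    """Recover (fx, fy, cx, cy) from a matrix in the convention above.

    The third row (near/far clip terms) is ignored on purpose.
    """
    fx = P[0][0] * width / 2.0
    fy = P[1][1] * height / 2.0
    cx = (P[0][2] + 1.0) * width / 2.0
    cy = (P[1][2] + 1.0) * height / 2.0
    return fx, fy, cx, cy

proj = intrinsics_to_projmat(500.0, 520.0, 310.0, 250.0, 640, 480)
recovered = projmat_to_intrinsics(proj, 640, 480)
```

Only the first two rows carry information about the pinhole model, which is why the near and far terms can be dropped without changing where points land in the image.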

oseiskar avatar Feb 18 '24 16:02 oseiskar

> > I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj

> @jh-surh I elaborated on this here: #97 (comment). Everything that can be implemented with projection matrices can also be implemented by extending the "intrinsics" to support camera models other than the ideal pinhole (fx,fy,cx,cy). It's also relatively simple to convert any typical (non-ortho) OpenGL projection matrix to (fx,fy,cx,cy) format (and throw away the near & far clip terms, which do not matter in this context).

Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.

> It would be good to compare these camera grads against the camera grads from the torch implementation with autograd. This PR adds a torch implementation, which should have correct (although slow) gradients since they're computed by torch.

Thank you for your suggestion! I found a bug while comparing the gradients against the torch implementation. I have fixed it and updated the test script, and it now passes the project gaussian test. It seems the current main branch is failing 3 of the test scripts, though. Someone should look at that.

jh-surh avatar Feb 19 '24 14:02 jh-surh

> Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.

Yes, and I agree that such stuff should definitely be implemented in some other PR.

However, the projection matrix fed to gsplat is not needed by Nerfstudio for any other purpose, as demonstrated here: https://github.com/SpectacularAI/nerfstudio/commit/bd234898725b1df46984a8f8b0925e2de43d4581 (works fine with #97).

My argument is that the more stuff is built on top of the current API with separate and redundant "projmat" + fx,fy,cx,cy, the more difficult it becomes to simplify it. I now remembered/realized that the situation with the redundant API is worse than I described earlier: "projmat" is actually not a "projection matrix" in any standard sense but a model-view-projection matrix, which adds another layer of confusion.

The complications caused by this API are clearly visible in this PR: the reason you need to compute the "projection matrix" gradient for pose optimization in the first place is that the parameter known as projmat is set to projmat @ viewmat in Nerfstudio, and this term depends on viewmat, which is the thing you actually want to modify in pose optimization.

Without the @ viewmat part, projection matrix gradients would only be needed for camera intrinsics optimization, but for this purpose, you would also need to add gradients w.r.t. fx, fy, cx, cy to work with the current API. Without "projmat", you would only need the latter.
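The dependency described above can be sketched with the chain rule. In this toy example (2x2 stand-ins for the 4x4 matrices, an arbitrary made-up loss, and hypothetical function names; none of it is gsplat code), the loss sees viewmat both directly and through combined = proj @ viewmat, so the full gradient w.r.t. viewmat is the direct term plus proj^T @ v_combined.

```python
def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def total_loss(viewmat, proj):
    """Toy loss using viewmat directly and via combined = proj @ viewmat."""
    combined = matmul(proj, viewmat)
    direct = sum(v * v for row in viewmat for v in row)
    via = sum(m ** 3 for row in combined for m in row)
    return direct + via

def grad_viewmat(viewmat, proj, include_combined_path=True):
    """d(loss)/d(viewmat) = direct term + proj^T @ d(loss)/d(combined)."""
    combined = matmul(proj, viewmat)
    grad = [[2.0 * v for v in row] for row in viewmat]
    if include_combined_path:
        v_combined = [[3.0 * m * m for m in row] for row in combined]
        via = matmul(transpose(proj), v_combined)
        grad = [[g + e for g, e in zip(gr, er)] for gr, er in zip(grad, via)]
    return grad

def numeric_grad(viewmat, proj, eps=1e-6):
    """Central finite differences as an independent reference."""
    grad = [[0.0] * len(viewmat[0]) for _ in viewmat]
    for i in range(len(viewmat)):
        for j in range(len(viewmat[0])):
            hi = [row[:] for row in viewmat]
            lo = [row[:] for row in viewmat]
            hi[i][j] += eps
            lo[i][j] -= eps
            grad[i][j] = (total_loss(hi, proj) - total_loss(lo, proj)) / (2 * eps)
    return grad

def max_abs_err(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

proj = [[1.5, 0.2], [0.0, 0.8]]
viewmat = [[0.9, -0.1], [0.3, 1.1]]

reference = numeric_grad(viewmat, proj)
err_full = max_abs_err(grad_viewmat(viewmat, proj), reference)
err_direct_only = max_abs_err(grad_viewmat(viewmat, proj, False), reference)
```

In this sketch, dropping the path through the combined matrix leaves a large error, which is why a pose gradient has to account for both paths whenever the "projmat" argument already contains viewmat.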

oseiskar avatar Feb 19 '24 16:02 oseiskar

> My argument is that the more stuff is built on top of the current API with separate and redundant "projmat" + fx,fy,cx,cy, the more difficult it becomes to simplify it. I now remembered/realized that the situation with the redundant API is worse than I described earlier: "projmat" is actually not a "projection matrix" in any standard sense but a model-view-projection matrix, which adds another layer of confusion.

> The complications caused by this API are clearly visible in this PR: the reason you need to compute the "projection matrix" gradient for pose optimization in the first place is that the parameter known as projmat is set to projmat @ viewmat in Nerfstudio, and this term depends on viewmat, which is the thing you actually want to modify in pose optimization.

> Without the @ viewmat part, projection matrix gradients would only be needed for camera intrinsics optimization, but for this purpose, you would also need to add gradients w.r.t. fx, fy, cx, cy to work with the current API. Without "projmat", you would only need the latter.

I gotta say this makes a lot of sense. Now that I think of it, this would cause the gradient for projmat to flow to viewmat, which I'm not sure is the right move.

jh-surh avatar Feb 21 '24 05:02 jh-surh

I'm thinking of merging #127 because of how much simpler it is and it seems to give good results, @jh-surh thoughts?

kerrj avatar Mar 20 '24 19:03 kerrj

> I'm thinking of merging #127 because of how much simpler it is and it seems to give good results, @jh-surh thoughts?

I agree. Both are equivalent to the best of my understanding.

maturk avatar Mar 20 '24 19:03 maturk

@kerrj @maturk @oseiskar I left a comment on https://github.com/nerfstudio-project/gsplat/pull/127. TL;DR large difference between pure pytorch gradient and approximated v_viewmat.

jh-surh avatar Mar 21 '24 04:03 jh-surh