gsplat
Add camera pose and projection gradient flow
- Added gradient backward flow to project_gaussians_backward_kernel for viewmat and projmat
- Added a test script, examples/test_pose_grad.py, to test the pose gradient update. The script interface is the same as simple_trainer.py:
python examples/test_pose_grad.py
python examples/test_pose_grad.py --img_path path/to/image
- Validated on nerfstudio Splatfacto in this PR: https://github.com/nerfstudio-project/nerfstudio/pull/2885
Running python examples/test_pose_grad.py --img_path path/to/image needs PyTorch3D. Maybe we should add the requirement to setup.py?
@ichsan2895, good idea, but I think it may be better to add it to examples/requirements.txt.
It would be good to compare these camera grads to the camera grads from the torch implementation with autograd. This PR adds a torch implementation which should have correct (although slow) gradients, since they're computed by torch.
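A minimal sketch of this kind of gradient comparison, written in plain Python so it runs without torch. It uses central finite differences as a stand-in for autograd, and checks a hand-derived gradient of a pinhole reprojection loss w.r.t. the camera translation. The function names, the identity-rotation simplification, and all numeric values are assumptions for illustration; they are not taken from gsplat.

```python
def project(point, t, fx, fy, cx, cy):
    # World-to-camera with identity rotation (assumption): only the translation t varies.
    x, y, z = (point[i] + t[i] for i in range(3))
    return fx * x / z + cx, fy * y / z + cy


def loss(point, t, target, intr):
    # Squared reprojection error against a target pixel.
    u, v = project(point, t, *intr)
    return 0.5 * ((u - target[0]) ** 2 + (v - target[1]) ** 2)


def analytic_grad_t(point, t, target, intr):
    # Hand-derived dL/dt, playing the role of the gradient from a custom backward pass.
    fx, fy, cx, cy = intr
    x, y, z = (point[i] + t[i] for i in range(3))
    ru = fx * x / z + cx - target[0]
    rv = fy * y / z + cy - target[1]
    return [
        ru * fx / z,
        rv * fy / z,
        -ru * fx * x / z**2 - rv * fy * y / z**2,
    ]


def numeric_grad_t(point, t, target, intr, eps=1e-6):
    # Central finite differences, playing the role of the slow reference gradient.
    grad = []
    for i in range(3):
        t_plus, t_minus = list(t), list(t)
        t_plus[i] += eps
        t_minus[i] -= eps
        diff = loss(point, t_plus, target, intr) - loss(point, t_minus, target, intr)
        grad.append(diff / (2 * eps))
    return grad


point = [0.3, -0.2, 2.0]             # hypothetical 3D point
t = [0.1, 0.05, 0.5]                 # hypothetical camera translation
target = (330.0, 250.0)              # hypothetical target pixel
intr = (500.0, 500.0, 320.0, 240.0)  # fx, fy, cx, cy

g_analytic = analytic_grad_t(point, t, target, intr)
g_numeric = numeric_grad_t(point, t, target, intr)
max_rel_err = max(
    abs(a - n) / max(1.0, abs(a)) for a, n in zip(g_analytic, g_numeric)
)
print("max relative error:", max_rel_err)
```

With torch available, the reference gradient would more naturally come from calling backward() on the torch implementation; the comparison step stays the same.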
Also this would be simpler without the redundant projmat (see https://github.com/nerfstudio-project/gsplat/pull/97). For camera calibration optimization, you would very likely want to optimize fx, fy, cx, cy (and possibly distortion parameters, if support for them is added) and not the redundant OpenGL/NDC-style projection matrix.
There's also a simpler way of achieving the same goal of optimizing the view matrix, shown here: https://github.com/nerfstudio-project/gsplat/pull/127
I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj
@jh-surh I elaborated on this here: https://github.com/nerfstudio-project/gsplat/pull/97#issuecomment-1951381539. Everything that can be implemented with projection matrices can also be implemented by extending the "intrinsics" to support camera models other than the ideal pinhole (fx,fy,cx,cy). It's also relatively simple to convert any typical (non-ortho) OpenGL projection matrix to (fx,fy,cx,cy) format (and throw away the near & far clip terms, which do not matter in this context).
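As a rough sketch of that conversion, under stated assumptions: the camera looks along +z (so the clip-space w equals camera z), NDC x and y span [-1, 1], and a pixel coordinate is (ndc + 1) * size / 2 with no half-pixel offset. Both helper names are hypothetical, and other OpenGL conventions (for example a -z viewing direction or a flipped y axis) would change some signs.

```python
def intrinsics_to_projmat(fx, fy, cx, cy, width, height, znear=0.01, zfar=100.0):
    # Build a perspective projection matrix from pinhole intrinsics
    # (assumed convention: +z forward, NDC in [-1, 1]).
    return [
        [2.0 * fx / width, 0.0, 2.0 * cx / width - 1.0, 0.0],
        [0.0, 2.0 * fy / height, 2.0 * cy / height - 1.0, 0.0],
        [0.0, 0.0, (zfar + znear) / (zfar - znear), -zfar * znear / (zfar - znear)],
        [0.0, 0.0, 1.0, 0.0],
    ]


def projmat_to_intrinsics(projmat, width, height):
    # Invert the mapping above; the near/far terms in row 2 are simply ignored.
    fx = projmat[0][0] * width / 2.0
    fy = projmat[1][1] * height / 2.0
    cx = (projmat[0][2] + 1.0) * width / 2.0
    cy = (projmat[1][2] + 1.0) * height / 2.0
    return fx, fy, cx, cy


# Round trip with hypothetical values for a 640x480 image.
original = (500.0, 520.0, 310.0, 250.0)
P = intrinsics_to_projmat(*original, width=640, height=480)
recovered = projmat_to_intrinsics(P, width=640, height=480)
print(recovered)
```

The round trip recovers the four intrinsics regardless of the znear and zfar used to build the matrix, which is the sense in which the clip terms do not matter here.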
Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.
It would be good to compare these camera grads to the camera grads from the torch implementation with autograd. This PR adds a torch implementation which should have correct (although slow) gradients, since they're computed by torch.
Thank you for your suggestion! I found a bug while comparing the gradients against the torch implementation. I have fixed it and updated the test script, and the project gaussian test now passes. It seems the current main branch is failing 3 of the test scripts, though. Someone should look at that.
Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.
Yes, and I agree that such stuff should definitely be implemented in some other PR.
However, the projection matrix that is fed to gsplat is not needed by Nerfstudio for any other purpose, as demonstrated here: https://github.com/SpectacularAI/nerfstudio/commit/bd234898725b1df46984a8f8b0925e2de43d4581 (works fine with #97).
My argument is that the more stuff is built on top of the current API with separate and redundant "projmat" + fx,fy,cx,cy, the more difficult it becomes to simplify it. I now remembered/realized that the situation with the redundant API is worse than I described earlier: "projmat" is actually not a "projection matrix" in any standard sense but a model-view-projection matrix, which adds another layer of confusion.
The complications caused by this API are clearly visible in this PR: the reason you need to compute the "projection matrix" gradient for pose optimization in the first place is that the parameter known as projmat is set to projmat @ viewmat in Nerfstudio, and this term depends on viewmat, which is the thing you actually want to modify in pose optimization.
Without the @ viewmat part, projection matrix gradients would only be needed for camera intrinsics optimization, but for this purpose, you would also need to add gradients w.r.t. fx, fy, cx, cy to work with the current API. Without "projmat", you would only need the latter.
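The dependency described above can be sketched with the chain rule. If the kernel receives both viewmat and fullmat = projmat @ viewmat, the total gradient w.r.t. viewmat is the direct term plus projmat^T @ v_fullmat. The snippet below checks this on a linear surrogate loss with finite differences. The matrices, the upstream gradients, and the names v_viewmat_direct and v_fullmat are placeholders for illustration, not values or identifiers from the actual backward kernel.

```python
def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def frobenius_dot(a, b):
    # Sum of elementwise products of two matrices.
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical projection and view matrices.
projmat = [[1.2, 0.0, 0.1, 0.0], [0.0, 1.6, -0.05, 0.0],
           [0.0, 0.0, 1.0002, -0.02], [0.0, 0.0, 1.0, 0.0]]
viewmat = [[1.0, 0.0, 0.0, 0.1], [0.0, 1.0, 0.0, -0.2],
           [0.0, 0.0, 1.0, 3.0], [0.0, 0.0, 0.0, 1.0]]

# Placeholder upstream gradients for the two matrix inputs of the kernel.
v_viewmat_direct = [[0.1 * (i + 1) - 0.05 * j for j in range(4)] for i in range(4)]
v_fullmat = [[0.02 * (i - j) + 0.01 for j in range(4)] for i in range(4)]


def surrogate_loss(view):
    # Linear loss whose gradients w.r.t. (view, full) are exactly the placeholders above.
    full = matmul(projmat, view)
    return frobenius_dot(v_viewmat_direct, view) + frobenius_dot(v_fullmat, full)


# Chain rule: total dL/dviewmat = direct term + projmat^T @ dL/dfullmat.
through_full = matmul(transpose(projmat), v_fullmat)
v_viewmat_total = [
    [v_viewmat_direct[i][j] + through_full[i][j] for j in range(4)] for i in range(4)
]

# Finite-difference check of every entry.
eps = 1e-6
max_err = 0.0
for i in range(4):
    for j in range(4):
        plus = [row[:] for row in viewmat]
        minus = [row[:] for row in viewmat]
        plus[i][j] += eps
        minus[i][j] -= eps
        numeric = (surrogate_loss(plus) - surrogate_loss(minus)) / (2 * eps)
        max_err = max(max_err, abs(numeric - v_viewmat_total[i][j]))
print("max abs error:", max_err)
```

If the kernel only took viewmat plus fx, fy, cx, cy, the through_full term would disappear and only the direct gradient would remain.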
I gotta say this makes a lot of sense. Now that I think of it, this would cause the gradient for projmat to flow to viewmat, which I'm not sure is the right move.
I'm thinking of merging #127 because of how much simpler it is and it seems to give good results, @jh-surh thoughts?
I agree. Both are equivalent to the best of my understanding.
@kerrj @maturk @oseiskar I left a comment on https://github.com/nerfstudio-project/gsplat/pull/127.
TL;DR: large difference between the pure PyTorch gradient and the approximated v_viewmat.