How to render orthographic projection views?

Open haksorus opened this issue 1 year ago • 28 comments

Hello! I would like to render some ortho-views of my scene using Orthographic projection (https://en.wikipedia.org/wiki/Orthographic_projection).

For this purpose, I changed the getProjectionMatrix() function to build an orthographic projection matrix: [image]

import math

import torch


def getProjectionMatrix(znear, zfar, fovX, fovY):
    # Half-extents of the view volume, taken at the near plane.
    tanHalfFovY = math.tan((fovY / 2))
    tanHalfFovX = math.tan((fovX / 2))

    top = tanHalfFovY * znear
    bottom = -top
    right = tanHalfFovX * znear
    left = -right

    P = torch.zeros(4, 4)

    z_sign = 1.0

    # Orthographic projection: scale and translate only, no perspective divide.
    P[0, 0] = 2.0 / (right - left)
    P[0, 3] = - (right + left) / (right - left)
    P[1, 1] = 2.0 / (top - bottom)
    P[1, 3] = - (top + bottom) / (top - bottom)
    P[2, 2] = -2.0 / (zfar - znear)
    P[2, 3] = - (zfar + znear)/(zfar - znear)
    P[3, 3] = z_sign  # w stays 1, so dividing by w changes nothing

    return P

But as a result, I always get a black picture. What could be done to solve this problem?

haksorus avatar Dec 26 '23 12:12 haksorus

The code uses a perspective projection matrix, under which the projected size of an object varies inversely with its distance from the camera. Orthographic projection does not behave that way. The CUDA code takes its depth information from the perspective projection matrix during rendering, and orthographic projection cannot provide that depth information. Therefore, if you want to make orthographic projection work, you have to change the CUDA code.
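
To make the size-versus-distance point concrete, here is a minimal sketch (not code from the repository) that projects the same 1-unit-wide object at two depths. The function names, the focal length, and the orthographic scale are made-up illustrative values.

```python
def perspective_width(width, depth, focal=1000.0):
    """Projected width in pixels under a pinhole model: shrinks as 1/depth."""
    return focal * width / depth


def orthographic_width(width, depth, pixels_per_unit=100.0):
    """Projected width in pixels under orthographic projection: depth is ignored."""
    return pixels_per_unit * width


for depth in (5.0, 10.0):
    # Perspective halves the width when the depth doubles; orthographic does not.
    print(depth, perspective_width(1.0, depth), orthographic_width(1.0, depth))
```

Under these assumptions, doubling the depth halves the perspective width while the orthographic width stays the same.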

Dehaoq avatar Dec 26 '23 20:12 Dehaoq

@haksorus Hello, I encountered the same problem: I want to render a top-down orthographic view. Did you figure out the issue?

hot-dog avatar Feb 01 '24 02:02 hot-dog

@Dehaoq I adopted haksorus's orthographic projection matrix with one small modification: when calculating the top, bottom, right, and left values, I use zfar instead of znear. With this change I get a reasonable result, shown below. The result is orthographic, since the buildings' facades are not visible, but it is foggy, which I think is due to the lack of depth information. Could you please give some suggestions for changing the CUDA code to incorporate depth information? Thank you! [image]
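
One plausible reason why switching from znear to zfar helps is the size of the view volume. The sketch below is only an illustration: the helper names are hypothetical, the 60 degree field of view is made up, and znear = 0.01 and zfar = 100.0 are assumed defaults that you should check against your own camera setup.

```python
import math


def half_extent(fov, depth):
    """Half-width of the view volume at a given depth, for a field of view in radians."""
    return math.tan(fov / 2.0) * depth


def ortho_ndc_x(x, right):
    """NDC x for a symmetric orthographic volume (left = -right).

    2x/(right-left) - (right+left)/(right-left) reduces to x/right.
    """
    return x / right


fov_x = math.radians(60.0)  # illustrative value
znear, zfar = 0.01, 100.0   # assumed defaults, not confirmed by this thread

right_near = half_extent(fov_x, znear)  # a few millimetres wide in scene units
right_far = half_extent(fov_x, zfar)    # tens of scene units wide

x = 5.0  # a point 5 scene units to the right of the optical axis
print(ortho_ndc_x(x, right_near))  # far outside [-1, 1], so off screen
print(ortho_ndc_x(x, right_far))   # inside [-1, 1], so visible
```

With extents taken at znear, almost every point of the scene lands far outside the visible range, which would be consistent with the black image in the original post.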

hot-dog avatar Feb 07 '24 07:02 hot-dog

@hot-dog Hi, have you achieved orthographic projection with GS? Looking forward to your update. Thank you!

GANWANSHUI avatar Feb 14 '24 10:02 GANWANSHUI

Sorry, I have not solved the problem mentioned above. I am still trying.

hot-dog avatar Feb 17 '24 01:02 hot-dog

Looking forward to the update, if any. Thanks a lot in advance!

GANWANSHUI avatar Feb 18 '24 14:02 GANWANSHUI

I have achieved clear orthographic rendering. The original code uses EWA splatting, which assumes perspective projection when projecting a 3D Gaussian to a 2D Gaussian. However, perspective projection causes distortion at non-central positions, so some desirable properties are lost when a 3D Gaussian is projected to 2D. To avoid that loss, the computeCov2D function in diff-gaussian-rasterization/cuda_rasterizer/forward.cu uses an approximation, namely the J matrix. As far as I understand, orthographic projection does not cause this distortion, so I replaced the J matrix with a diagonal matrix as follows, and it works! [image]

I am not very clear on the underlying principles yet. My math foundation is weak, and I don't know how to derive the exact J matrix. If you can help, it would be much appreciated :)
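
As a rough illustration of the change described above (not the repository's CUDA code), the sketch below contrasts the Jacobian of a pinhole projection u = fx*x/z, v = fy*y/z with a constant diagonal Jacobian. The function names are hypothetical, the covariance is assumed to be already expressed in camera space, and diag(150, 150, 0) is an assumption about the missing screenshot, based on the value 150 discussed in this thread.

```python
def perspective_jacobian(fx, fy, t):
    """Jacobian of (fx*x/z, fy*y/z) at camera-space point t; third row zeroed."""
    x, y, z = t
    return [
        [fx / z, 0.0, -fx * x / (z * z)],
        [0.0, fy / z, -fy * y / (z * z)],
        [0.0, 0.0, 0.0],
    ]


def orthographic_jacobian(sx, sy):
    """Constant Jacobian of an orthographic map: independent of the point."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 0.0]]


def matmul(a, b):
    """Plain 3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def project_cov(J, cov_cam):
    """2D covariance: upper-left 2x2 block of J * cov_cam * J^T."""
    full = matmul(matmul(J, cov_cam), transpose(J))
    return [row[:2] for row in full[:2]]


# An axis-aligned Gaussian in camera space (illustrative numbers).
cov_cam = [[0.04, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.09]]

cov2d_ortho = project_cov(orthographic_jacobian(150.0, 150.0), cov_cam)
print(cov2d_ortho)

# The perspective scale factor changes with depth; the orthographic one does not.
print(perspective_jacobian(1000.0, 1000.0, (0.0, 0.0, 5.0))[0][0])
print(perspective_jacobian(1000.0, 1000.0, (0.0, 0.0, 10.0))[0][0])
```

The diagonal form simply scales the camera-space x and y variances by a fixed factor squared, so a Gaussian keeps the same footprint regardless of its depth or its offset from the image centre.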

hot-dog avatar Mar 04 '24 09:03 hot-dog

Hi, did you train with orthographic views, or only test rendering with orthographic views? @hot-dog Looking forward to your reply, thanks!

cv-lab-x avatar Apr 02 '24 07:04 cv-lab-x

I am only testing rendering with orthographic projection. Training has to use perspective projection because the training images were taken with a pinhole camera; training with orthographic projection would not converge.

hot-dog avatar Apr 02 '24 07:04 hot-dog

Thanks. What is the meaning of 150 in the J matrix you modified?

cv-lab-x avatar Apr 02 '24 12:04 cv-lab-x

I tried several values and got the best result with 150. The value was found experimentally and should vary with the scene and camera pose. As I said in my earlier post, my math is poor and I don't know how to derive the exact J matrix; it should be related to the parameters of the Gaussian point currently being processed. If you could help with the derivation of the J matrix or give some suggestions, it would be much appreciated! :)

hot-dog avatar Apr 03 '24 01:04 hot-dog

Hi, any update now? I mean, is there some way to get orthographic test views without an experimentally chosen value? @hot-dog

boqian-li avatar Apr 09 '24 01:04 boqian-li

Hi, any update now?

boqian-li avatar Apr 17 '24 20:04 boqian-li

Spline can render with an orthographic camera, but I have no idea how to do it. https://www.reddit.com/r/Spline3D/comments/184e9df/spline_tip_use_the_perspective_camera_when/

lieoojinyi avatar Apr 29 '24 07:04 lieoojinyi

Hi buddy, have you achieved orthographic rendering?

gwen233666 avatar Jun 06 '24 13:06 gwen233666

I have successfully achieved clear orthographic projection by modifying the projection matrix and the J matrix! You can see that the buildings keep a consistent scale from the top of the picture to the bottom, unlike perspective projection, where near objects appear large and far objects appear small. [image: 0037 (4)]

wangyicxy avatar Jul 03 '24 06:07 wangyicxy

I feel that the projection quality is still somewhat degraded. Could you please post a set of pictures for comparison? It would be even better if you could share this dataset! Thank you.

gwen233666 avatar Jul 03 '24 07:07 gwen233666

Okay, this is the result of perspective projection. It seems to have been magnified, and I'm not sure whether that is normal. [image: 0037 (1)]

wangyicxy avatar Jul 03 '24 07:07 wangyicxy

Hi, could you please share your modified projection matrix and J matrix?

YihangChen-ee avatar Jul 16 '24 06:07 YihangChen-ee

Following the notation used on Wikipedia (https://en.wikipedia.org/wiki/Orthographic_projection), as mentioned by @haksorus, orthographic projection can be represented by the matrix operation below: [image]

Let us denote the resulting coordinates on the right-hand side with a prime symbol ('). Then the Jacobian matrix is: [image]

where we set the third row to 0, consistent with computeCov2D() in submodules/diff-gaussian-rasterization/cuda_rasterizer/forward.cu. Moreover, computeCov2D() requires the covariance to be scaled to pixel space. Finally, we get the Jacobian matrix as: [image]

Intuitively, this pixel-space scaling is why a large value, such as 150 (https://github.com/graphdeco-inria/gaussian-splatting/issues/578#issuecomment-1976123500), ends up in the Jacobian matrix. By modifying the projection matrix as suggested in https://github.com/graphdeco-inria/gaussian-splatting/issues/578#issue-2056350773 and substituting the newly derived J for the original, I obtained a sharp image under orthographic projection.
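
Since the equation images are missing, here is a hedged reconstruction of the derivation in code form. It assumes the Wikipedia matrix, whose x and y rows are x' = 2x/(right-left) + const and y' = 2y/(top-bottom) + const, and an NDC-to-pixel mapping with slope W/2 and H/2, which gives J = diag(W/(right-left), H/(top-bottom), 0). The function name, image size, field of view, and depth are illustrative assumptions, not values from the thread.

```python
import math


def orthographic_jacobian(width_px, height_px, left, right, bottom, top):
    """Jacobian of the orthographic map from camera space to pixel space.

    d(pixel_x)/dx = (width_px / 2) * 2 / (right - left) = width_px / (right - left),
    and likewise for y. The third row is zero, as described for computeCov2D().
    """
    sx = width_px / (right - left)
    sy = height_px / (top - bottom)
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 0.0]]


width_px, height_px = 1600, 900  # illustrative image size
fov_x = math.radians(60.0)       # illustrative field of view
focal_x = width_px / (2.0 * math.tan(fov_x / 2.0))
fov_y = 2.0 * math.atan(height_px / (2.0 * focal_x))  # assumes square pixels
depth = 100.0  # depth at which the extents are taken (zfar in the earlier comment)

right = math.tan(fov_x / 2.0) * depth
top = math.tan(fov_y / 2.0) * depth
J = orthographic_jacobian(width_px, height_px, -right, right, -top, top)

# With symmetric extents taken at a fixed depth, the scale equals focal / depth.
print(J[0][0], J[1][1], focal_x / depth)
```

Under these assumptions the diagonal entries equal focal / depth, that is, the perspective Jacobian evaluated at one fixed depth with its off-diagonal terms dropped. This would explain why a single hand-tuned constant can work for one camera setup but needs changing for another.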

Regarding @Dehaoq's concerns about depth, further discussion may be necessary.

Sapphire-356 avatar Aug 27 '24 14:08 Sapphire-356