about projection_matrix

Open windkiss5 opened this issue 1 year ago • 7 comments

Hello author, thank you very much for your work. The code is very concise and elegant. I have a small question about projection matrices. Most of the camera intrinsic matrices I work with look like this (unlike OpenGL's): [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]].

If I manually specify a 'zfar' and follow your code at https://github.com/graphdeco-inria/gaussian-splatting/blob/d9fad7b3450bf4bd29316315032d57157e23a515/utils/graphics_utils.py#L51

can I convert my intrinsic matrix directly to

[2.0 * znear/W, 0.0, 0.0, 0.0]
[0.0, 2.0 * znear/H, 0.0, 0.0],
[0.0, 0.0, f/(f - n), -f * n/(f - n)]

?

In other words, I am treating 'znear' as the focal length. Is this correct?

windkiss5 avatar Feb 22 '24 14:02 windkiss5
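
A quick numerical check may help with the question above. In the frustum form of the matrix, the first diagonal entry is 2 * znear / (right - left), where right - left is the width of the near plane in camera units, not the image width W in pixels. Assuming a pinhole camera with focal length fx in pixels and a centered principal point, the near-plane width is znear * W / fx, so znear cancels out. The helper name `p00_from_intrinsics` and the numbers below are made up for illustration and are not part of the repository.

```python
def p00_from_intrinsics(znear, fx, width):
    """First diagonal entry of a frustum-style projection matrix.

    Assumes a pinhole camera with focal length fx (in pixels) and a
    centered principal point. Hypothetical helper, for illustration only.
    """
    # Half-width of the near plane in camera units.
    right = znear * (width / 2.0) / fx
    left = -right
    return 2.0 * znear / (right - left)


fx, width = 1200.0, 1600
for znear in (0.01, 1.0, 100.0):
    # Each znear gives the same value as 2 * fx / width (up to rounding).
    print(znear, p00_from_intrinsics(znear, fx, width), 2.0 * fx / width)
```

Under these assumptions, 2.0 * znear / W matches this entry only when znear happens to equal fx numerically. In the depth terms, znear still plays the role of the near clipping distance.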

getProjectionMatrix is used to project a 3D point from camera view space to NDC space. You can refer to https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix.html to learn more.

ShaohuaL avatar Feb 29 '24 03:02 ShaohuaL

I have one question: why does the formula used here to compute the projection matrix differ from the matrix given by OpenGL? (Two images were attached: the left one shows the matrix from getProjectionMatrix, and the right one shows the OpenGL matrix.) Looking forward to your response!

fzhiheng avatar Apr 17 '24 02:04 fzhiheng

Hello, it is because the authors did not follow OpenGL's conventions. Maybe you can refer to https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix.html

ShaohuaL avatar Apr 17 '24 03:04 ShaohuaL
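
One concrete difference between the two conventions is the depth range. The sketch below compares the third-row entries quoted in this thread, zfar / (zfar - znear) and -(zfar * znear) / (zfar - znear), against the standard OpenGL perspective matrix. The function names are hypothetical, and the OpenGL side assumes the usual setup where the camera looks down the negative z axis.

```python
def depth_ndc_zero_to_one(z, znear, zfar):
    """NDC depth for a z-forward camera, using the entries quoted in this thread.

    Clip-space w equals z here, so the result is z_clip / z.
    """
    z_clip = zfar / (zfar - znear) * z - (zfar * znear) / (zfar - znear)
    return z_clip / z


def depth_ndc_opengl(depth, znear, zfar):
    """NDC depth under the standard OpenGL perspective matrix.

    A point at positive distance `depth` in front of the camera has
    z_eye = -depth, and clip-space w equals -z_eye.
    """
    z_eye = -depth
    z_clip = (-(zfar + znear) / (zfar - znear) * z_eye
              - 2.0 * zfar * znear / (zfar - znear))
    w_clip = -z_eye
    return z_clip / w_clip


znear, zfar = 0.01, 100.0
for d in (znear, zfar):
    # First mapping covers [0, 1]; the OpenGL mapping covers [-1, 1].
    print(d, depth_ndc_zero_to_one(d, znear, zfar), depth_ndc_opengl(d, znear, zfar))
```

The x and y rows are unaffected by this choice, so the two matrices can differ in the third row and in the sign of the fourth row while still describing the same camera.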

@ShaohuaL Thanks for your response! I have read the link you provided. The main difference is the interval that z is mapped to. Following the link, I used the 0-1 mapping, but my final result still differs from the matrix in the code. Here is my result: [image] I tested my projection matrix by rendering a cropped image. It works, while the matrix in the code fails.

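import torch


def getProjectionMatrix2(znear, zfar, K, W, H):
    # Intrinsics: focal lengths and principal point, in pixels.
    fx = K[0, 0]
    fy = K[1, 1]
    cx = K[0, 2]
    cy = K[1, 2]
    # Near-plane extents; asymmetric when (cx, cy) is not the image center.
    top = znear * cy / fy
    bottom = -znear * (H - cy) / fy
    right = znear * (W - cx) / fx
    left = -znear * cx / fx

    P = torch.zeros(4, 4)
    z_sign = 1.0

    # Scale terms for x and y.
    P[0, 0] = 2.0 * znear / (right - left)
    P[1, 1] = 2.0 * znear / (top - bottom)
    # Principal-point offset terms (zero for a symmetric frustum).
    P[0, 2] = -(right + left) / (right - left)
    P[1, 2] = (top + bottom) / (top - bottom)
    # Clip-space w equals z; depth maps [znear, zfar] to [0, 1].
    P[3, 2] = z_sign
    P[2, 2] = z_sign * zfar / (zfar - znear)
    P[2, 3] = -(zfar * znear) / (zfar - znear)

    return P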

The source code itself has no problem in its own setting, because there left = -right, so right + left equals 0 and the offset term vanishes.

fzhiheng avatar Apr 18 '24 09:04 fzhiheng
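
One way to check a matrix built from intrinsics is to push a camera-space point through it and compare the result with a direct pinhole projection. The sketch below re-implements the formulas from the post above with plain nested lists so it runs without torch. The intrinsics, the test point, and both function names are made up for illustration. It also assumes that NDC -1 and +1 map to the image edges, which may differ by half a pixel from what a particular rasterizer does.

```python
def projection_from_intrinsics(znear, zfar, fx, fy, cx, cy, width, height):
    """Plain-Python version of the matrix posted above (nested lists)."""
    top = znear * cy / fy
    bottom = -znear * (height - cy) / fy
    right = znear * (width - cx) / fx
    left = -znear * cx / fx

    P = [[0.0] * 4 for _ in range(4)]
    P[0][0] = 2.0 * znear / (right - left)
    P[1][1] = 2.0 * znear / (top - bottom)
    P[0][2] = -(right + left) / (right - left)
    P[1][2] = (top + bottom) / (top - bottom)
    P[3][2] = 1.0
    P[2][2] = zfar / (zfar - znear)
    P[2][3] = -(zfar * znear) / (zfar - znear)
    return P


def project_to_pixel(P, point, width, height):
    """Apply P to a camera-space point, divide by w, and map NDC to pixels.

    Assumption: NDC -1 and +1 correspond to pixel 0 and width (or height).
    """
    x, y, z = point
    vec = (x, y, z, 1.0)
    clip = [sum(P[r][c] * vec[c] for c in range(4)) for r in range(4)]
    ndc_x, ndc_y = clip[0] / clip[3], clip[1] / clip[3]
    return (ndc_x + 1.0) * 0.5 * width, (ndc_y + 1.0) * 0.5 * height


# Hypothetical intrinsics with an off-center principal point.
fx, fy, cx, cy = 1000.0, 1100.0, 700.0, 300.0
width, height = 1600, 900
point = (0.4, -0.2, 3.0)  # x-right, y-down, z-forward

P = projection_from_intrinsics(0.01, 100.0, fx, fy, cx, cy, width, height)
u, v = project_to_pixel(P, point, width, height)

# Reference: direct pinhole projection with the intrinsic matrix.
u_ref = fx * point[0] / point[2] + cx
v_ref = fy * point[1] / point[2] + cy
print((u, v), (u_ref, v_ref))
```

Under these assumptions the two pixel positions agree up to rounding, including when the principal point is off-center.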

@fzhiheng Hi! I think in your result, the element in (row 2, col 3) should also have a negative sign, i.e. -(t+b)/(t-b). Is that a typo?

LiuJF1226 avatar Apr 30 '24 10:04 LiuJF1226

@LiuJF1226 That's what it looks like. Note that the camera coordinate system used in the code is x-right, y-down, z-forward.

fzhiheng avatar May 02 '24 13:05 fzhiheng

@fzhiheng You are right. I didn't notice that in your code, top = znear * cy / fy and bottom = -znear * (H - cy) / fy. My derivation works directly in the RDF camera coordinate system, where I set bottom = -znear * cy / fy and top = znear * (H - cy) / fy. Under that convention, the element in (row 2, col 3) should be -(t+b)/(t-b). Actually, both formulations are right.

LiuJF1226 avatar May 03 '24 12:05 LiuJF1226
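
The agreement between the two formulations can be checked numerically. The sketch below computes the (row 2, col 3) element both ways, using the top and bottom definitions quoted in the last comment. The function names and sample values are hypothetical.

```python
def p12_posted_code(znear, fy, cy, height):
    """(row 2, col 3) element with top = znear * cy / fy, as in the posted code."""
    top = znear * cy / fy
    bottom = -znear * (height - cy) / fy
    return (top + bottom) / (top - bottom)


def p12_rdf_derivation(znear, fy, cy, height):
    """Same element with bottom = -znear * cy / fy, as in the RDF derivation."""
    bottom = -znear * cy / fy
    top = znear * (height - cy) / fy
    return -(top + bottom) / (top - bottom)


znear, fy, cy, height = 0.01, 1100.0, 300.0, 900
a = p12_posted_code(znear, fy, cy, height)
b = p12_rdf_derivation(znear, fy, cy, height)
# Both reduce algebraically to 2 * cy / height - 1.
print(a, b, 2.0 * cy / height - 1.0)
```

Swapping which edge is called top flips the sign of t + b, and the explicit minus sign in the second form flips it back.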