Viser's perspective projection
My guess is that Viser uses a perspective projection, then switches to an orthographic projection and performs rendering. If an object is inside the frustum, it is rendered; if it is outside the frustum, it is not rendered.
Based on this speculation, I used `CameraHandle.position`, `CameraHandle.fov`, and `CameraHandle.aspect`. Then I set up a near plane and a far plane, and used these 5 parameters to create a frustum (the red part).
Now I am using Nerfstudio's gsplat algorithm for training and have obtained many checkpoint (ckpt) files. I generate a bounding box for each ckpt. If the frustum intersects the bounding box, I load that ckpt; if it does not intersect, I skip it. I have now discovered a bug: I use nerfview ( https://github.com/hangg7/nerfview ). When I rotate with the left or right arrow key on the keyboard, some data may be missing from the scene, because the bounding boxes of some data do not intersect the frustum and therefore do not participate in rendering.
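For reference, here is a minimal sketch of the conservative frustum-vs-bounding-box culling test I am describing; the plane representation and the helper name `aabb_intersects_frustum` are my own illustration, not part of Viser or Nerfstudio:

```python
import numpy as np

def aabb_intersects_frustum(aabb_min, aabb_max, frustum_planes):
    """Conservative visibility test for one checkpoint's bounding box.

    frustum_planes: list of (normal, point) pairs, one per frustum face,
    with every normal pointing toward the inside of the frustum.
    Returns False only when the box is certainly outside the frustum; it may
    return True for some boxes that are actually outside, which is fine for culling.
    """
    aabb_min, aabb_max = np.asarray(aabb_min), np.asarray(aabb_max)
    # All 8 corners of the axis-aligned bounding box.
    corners = np.array([[x, y, z]
                        for x in (aabb_min[0], aabb_max[0])
                        for y in (aabb_min[1], aabb_max[1])
                        for z in (aabb_min[2], aabb_max[2])])
    for normal, point in frustum_planes:
        # If every corner lies on the outside of this plane, the box can be culled.
        if np.all((corners - point) @ np.asarray(normal) < 0.0):
            return False
    return True
```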
I drew the four points A, B, C, D of the far plane in the browser.
Theoretically, points A, B, C, D should be located at the four corners of the screen.
But when I press the left or right arrow key on the keyboard, only 2 of the points are on the screen; points B and D are outside.
My guess is that when I press the left or right arrow key, the frustum rotates, but something else does not rotate with it, causing a misalignment, so some data is not displayed.
I tried my best to solve the bug but failed, perhaps because I am missing some key information about Viser. Can you help me? Thank you very much.
Hello!
Yeah, unfortunately this just seems like a bug in the frustum math on your end. Maybe double-check that you're using the camera rotation correctly when you compute the frustum bounds? From what you've drawn it looks like the frustum you've computed is just not rotating correctly with the camera.
You are right, thank you.
When I press the left or right arrow key on the keyboard, something rotates, but the frustum does not rotate with it, causing a misalignment, so some data is not displayed.
This means that during rotation, the frustum keeps its original pose and does not move.
So the fundamental reason is that the 5 parameters I used (`CameraHandle.position`, `CameraHandle.fov`, `CameraHandle.aspect`, a near plane, and a far plane) are not enough.
These 5 parameters can describe frustums pointing in many different directions.
I should take rotation into account. Do you know how to obtain this parameter in Viser? Thank you very much.
Makes sense! Yeah, the rotation is stored as a quaternion here: https://viser.studio/latest/camera_handles/#viser.CameraHandle.wxyz
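As a hedged sketch of how that wxyz quaternion can be turned into a rotation matrix (using SciPy purely for illustration; note that viser stores quaternions in wxyz order, while `scipy.spatial.transform.Rotation` expects xyzw):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotation_matrix_from_wxyz(wxyz):
    """Convert a viser-style wxyz quaternion into a 3x3 rotation matrix."""
    w, x, y, z = wxyz
    # SciPy expects xyzw ordering, so reorder before converting.
    return Rotation.from_quat([x, y, z, w]).as_matrix()

# Hypothetical usage with a viser camera handle:
# R_world_cam = rotation_matrix_from_wxyz(client.camera.wxyz)
# forward_world = R_world_cam @ np.array([0.0, 0.0, 1.0])  # camera +Z axis in world frame
```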
Thank you very much for your answer, but I am not able to use quaternions correctly.
The way I compute the frustum coordinates:
1. To simplify the problem, I use a rectangular pyramid instead of a frustum.
2. Draw an upward-pointing rectangular pyramid with its apex at `camera.position`.
3. Generate a look-at vector = `camera.look_at - camera.position`.
4. Calculate the rotation matrix that maps (0, 0, 1) onto the look-at vector.
5. Apply the rotation matrix to the four base vertices of the rectangular pyramid; the look-at vector becomes the axis of the pyramid.
The problem is that there are countless rectangular pyramids sharing the same axis, and I need a roll angle around that axis to determine a unique one.
My code:

```python
import numpy as np

def rotation_matrix_from_vectors(vec1, vec2):
    """Find the rotation matrix that aligns the 3D "source" vector vec1 to the 3D "destination" vector vec2."""
    a, b = (vec1 / np.linalg.norm(vec1)).reshape(3), (vec2 / np.linalg.norm(vec2)).reshape(3)
    v = np.cross(a, b)
    c = np.dot(a, b)
    s = np.linalg.norm(v)
    if s == 0:  # the vectors are parallel or anti-parallel
        if c > 0:
            return np.identity(3)
        # Anti-parallel case: -identity is a reflection, not a rotation, so
        # instead rotate 180 degrees about any axis perpendicular to a.
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-8:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis = axis / np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.identity(3)
    I = np.identity(3)
    Vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    R = I + Vx + np.dot(Vx, Vx) * ((1 - c) / (s ** 2))
    return R
```
```python
def draw_frustum(x_center, y_center, z_center, fov, aspect_ratio, far, direction, wxyz=None):
    fov_rad = fov  # fov is assumed to already be in radians
    far_height = 2 * np.tan(fov_rad / 2) * far
    far_width = far_height * aspect_ratio
    # Far plane vertices (camera looking along +Z before rotation)
    far_plane = np.array([
        [-far_width / 2, -far_height / 2, far],
        [far_width / 2, -far_height / 2, far],
        [far_width / 2, far_height / 2, far],
        [-far_width / 2, far_height / 2, far],
    ])
    far_plane += np.array([x_center, y_center, z_center])
    # Rotation to align the frustum with the direction vector
    direction = np.array(direction)
    forward = np.array([0, 0, 1])  # initial forward direction
    R = rotation_matrix_from_vectors(forward, direction)
    rotated_base_points = []
    for point in far_plane:
        vector = point - np.array([x_center, y_center, z_center])
        rotated_vector = R @ vector
        rotated_point = rotated_vector + np.array([x_center, y_center, z_center])
        rotated_base_points.append(rotated_point)
    far_plane = np.stack(rotated_base_points, axis=0)
    # Create the frustum as apex (camera position) plus the rotated far-plane corners
    camera_position = np.array([x_center, y_center, z_center])
    frustum = np.vstack((camera_position, far_plane))
    return frustum
```
How do I use quaternions correctly? Or should I choose a different way of drawing the rectangular pyramid? Or does Viser have a parameter that gives me the rotation angle for the fifth step? Thank you very much.
Okay interesting!
Instead of trying to rotate the frustum into the world frame, for me it's easier to visualize leaving the frustum at the camera frame origin and instead transforming the Gaussians so they're defined with respect to the camera. My high-level recommendation would be to:
- Ignore the `look_at` + `up_direction` attributes of the camera. Using `wxyz` is probably easier.
- Convert `camera.wxyz` and `camera.position` to a single rigid transform. `T_cam_world = vtf.SE3.from_rotation_and_translation(vtf.SO3(wxyz), position).inverse()` can give you the necessary transformation to put world-frame coordinates into the camera frame. [^1]
- Use `T_cam_world` to compute camera-frame Gaussian centers from world-frame Gaussian centers. Pseudocode: `centers_wrt_cam = T_cam_world @ centers_wrt_world`
- Define the frustum using fov parameters, with +Z forward, +X right, +Y down. Check whether the points are in the frustum.

[^1]: Note that `wxyz` and `position` for the camera both correspond to `T_world_cam`, so we need to invert. https://viser.studio/latest/conventions/#poses is also relevant.
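As a rough sketch of the second and third bullets above (assuming `viser.transforms` with its jaxlie-style accessors; the helper name `centers_in_camera_frame` is mine):

```python
import numpy as np
import viser.transforms as vtf

def centers_in_camera_frame(wxyz, position, centers_wrt_world):
    """Transform (N, 3) world-frame Gaussian centers into the camera frame."""
    T_cam_world = vtf.SE3.from_rotation_and_translation(
        vtf.SO3(np.asarray(wxyz)), np.asarray(position)
    ).inverse()
    R = T_cam_world.rotation().as_matrix()  # (3, 3) rotation of the camera-from-world transform
    t = T_cam_world.translation()           # (3,) translation of the camera-from-world transform
    # Apply T_cam_world to every point; the result is +Z forward, +X right, +Y down.
    return centers_wrt_world @ R.T + t
```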
thank you
Are you saying that I transform the position of a Gaussian from the world coordinate system to the camera coordinate system? Then determine whether the Gaussian point of the camera coordinate system is inside or outside the frustum of the camera coordinate system?
So the question is how I get the precise location of the frustum in the camera coordinate system, like in this picture: the Gaussian point is in the black frustum but not in the red frustum. I still need rotation.
Or is there no rotation in the frustum of the camera coordinate system, like in this picture?
I can directly write the coordinates, like this:
```python
far_plane = np.array([
    [-far_width / 2, -far_height / 2, far],
    [far_width / 2, -far_height / 2, far],
    [far_width / 2, far_height / 2, far],
    [-far_width / 2, far_height / 2, far],
])
```
thank you
I found that the photos added by `add_camera_frustum` in ns-viewer seem to overlap with the results rendered by splatfacto; perhaps the way the frustum is drawn in `add_camera_frustum` is helpful to me.
Are you saying that I transform the position of a Gaussian from the world coordinate system to the camera coordinate system? Then determine whether the Gaussian point of the camera coordinate system is inside or outside the frustum of the camera coordinate system?
Yes!
So the question is how I get the precise location of the frustum in the camera coordinate system, like in this picture: the Gaussian point is in the black frustum but not in the red frustum. I still need rotation.
The camera frustum in the camera frame should have no rotation applied to it, since it's rigidly attached to the camera frame. So I think this should actually be easier than you'd expect.
Here's an approach that looks correct to me:
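A minimal sketch of such a check, assuming `fov` is the vertical field of view in radians and the camera frame is +Z forward, +X right, +Y down (the helper name `in_frustum_fov` is mine, not necessarily the code referenced above):

```python
import numpy as np

def in_frustum_fov(centers_wrt_cam, fov, aspect, near, far):
    """Boolean mask of camera-frame points inside a symmetric frustum."""
    x, y, z = centers_wrt_cam[:, 0], centers_wrt_cam[:, 1], centers_wrt_cam[:, 2]
    tan_half_y = np.tan(fov / 2.0)    # vertical half-angle
    tan_half_x = tan_half_y * aspect  # horizontal half-angle (aspect = width / height)
    return (
        (z > near) & (z < far)
        & (np.abs(x) < z * tan_half_x)
        & (np.abs(y) < z * tan_half_y)
    )
```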
This also seems correct with a more common (in computer vision) intrinsics matrix:
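And a hedged sketch of the intrinsics-matrix variant, assuming a standard pinhole `K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]` (again my own illustration):

```python
import numpy as np

def in_frustum_intrinsics(centers_wrt_cam, K, width, height, near, far):
    """Project camera-frame points with K and keep those that land inside the image."""
    z = centers_wrt_cam[:, 2]
    with np.errstate(divide="ignore", invalid="ignore"):
        uvw = centers_wrt_cam @ K.T   # homogeneous pixel coordinates, (N, 3)
        u = uvw[:, 0] / uvw[:, 2]
        v = uvw[:, 1] / uvw[:, 2]
    # Points behind the camera are removed by the z > near term regardless of u, v.
    return (z > near) & (z < far) & (u >= 0) & (u < width) & (v >= 0) & (v < height)
```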
This also seems correct with a more common (in computer vision) intrinsics matrix:
That's right, this method can determine whether a point is in the frustum; it is also used in the 3DGS code.
ChatGPT is very clever. I found the missing parameter (the up direction), which lets me determine the unique frustum.
Code:

```python
import numpy as np

def calculate_frustum_corners(eye, direction, up, fov, aspect_ratio, near, far):
    # Camera position (frustum apex).
    eye = np.array(eye)
    # Normalized viewing direction.
    direction = np.array(direction) / np.linalg.norm(direction)
    # Normalized up vector (may not yet be orthogonal to the viewing direction).
    up = np.array(up) / np.linalg.norm(up)
    # Right vector, orthogonal to both direction and up.
    right = np.cross(direction, up)
    right = right / np.linalg.norm(right)
    # Re-orthogonalized up vector.
    up = np.cross(right, direction)
    tan_fov = np.tan(np.radians(fov / 2))  # fov is expected in degrees here
    near_height = 2 * tan_fov * near
    near_width = near_height * aspect_ratio
    far_height = 2 * tan_fov * far
    far_width = far_height * aspect_ratio
    near_center = eye + direction * near
    far_center = eye + direction * far
    corners = {}
    corners['Near Top Left'] = near_center + (up * (near_height / 2)) - (right * (near_width / 2))
    corners['Near Top Right'] = near_center + (up * (near_height / 2)) + (right * (near_width / 2))
    corners['Near Bottom Left'] = near_center - (up * (near_height / 2)) - (right * (near_width / 2))
    corners['Near Bottom Right'] = near_center - (up * (near_height / 2)) + (right * (near_width / 2))
    corners['Far Top Left'] = far_center + (up * (far_height / 2)) - (right * (far_width / 2))
    corners['Far Top Right'] = far_center + (up * (far_height / 2)) + (right * (far_width / 2))
    corners['Far Bottom Left'] = far_center - (up * (far_height / 2)) - (right * (far_width / 2))
    corners['Far Bottom Right'] = far_center - (up * (far_height / 2)) + (right * (far_width / 2))
    return corners
```
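A hedged usage sketch with the viser camera handle attributes (assuming `camera.fov` is in radians, so it is converted to degrees for this function; the near/far values are arbitrary):

```python
# Hypothetical usage with a connected viser client:
camera = client.camera
corners = calculate_frustum_corners(
    eye=camera.position,
    direction=camera.look_at - camera.position,
    up=camera.up_direction,
    fov=np.degrees(camera.fov),  # this function applies np.radians internally
    aspect_ratio=camera.aspect,
    near=0.1,
    far=10.0,
)
```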
Is there a way to set the near and far clipping planes of the camera in a viser scene?
Is there a way to set the near and far clipping planes of the camera in a viser scene?
Hi, the camera handle is part of the "client", so you have to retrieve the clients of the session and set the near and far clipping planes through it.
Hi! Yes, sorry - I think I added this comment before filing issue #355, which was immediately fixed with PR #356 that added client.camera.near and client.camera.far. Forgot to update this thread as well!
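For anyone landing here later, a minimal sketch of setting those attributes for each connected client (the specific values are arbitrary):

```python
import viser

server = viser.ViserServer()

@server.on_client_connect
def _(client: viser.ClientHandle) -> None:
    # client.camera.near / client.camera.far were added in PR #356.
    client.camera.near = 0.05
    client.camera.far = 50.0
```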
