Negative values in the intrinsics matrix generated in the MOVi datasets
Hello,
I've been playing around with the MOVi dataset, and I found something odd about its intrinsics matrix:
K = [[1.09375, 0.0, -0.5],
[0.0, -1.09375, -0.5],
[0.0, 0.0, -1.0]]
AFAIK the intrinsics matrix should have the form
K = [[fx, 0.0, cx],
[0.0, fy, cy],
[0.0, 0.0, 1.0]]
where fx, fy, cx, cy > 0. What does it mean when the intrinsics matrix has negative values, i.e. the entries K[1][1] = -1.09375 and K[2][2] = -1.0 (0-indexed)?
Thank you in advance!
@andrewsonga Hi, I ran into the same issue when projecting points from the world coordinate system into the camera coordinate system. Have you found a solution yet?
Hi, @andrewsonga @zhangzjjjjjj
While I'm not one of the official authors, I've looked into this and can explain it.
The short answer is that the negative values in the intrinsics matrix result from Kubric using the same camera coordinate convention as OpenGL, rather than the one typically used in OpenCV.
Here is a more detailed breakdown:
1. The OpenCV Camera Convention (The Baseline)
In the standard OpenCV convention, the camera coordinate system is defined as:
- +X axis points to the right.
- +Y axis points down.
- +Z axis points forward (into the scene).
The projection equation from camera coordinates $(X, Y, Z)$ to pixel coordinates $(u, v)$ is:
$$ Z\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \mathbf{K} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} $$
Here, all parameters ($f_x, f_y, c_x, c_y$) are positive.
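As a concrete illustration (plain NumPy, not part of the Kubric code; the MOVi values with principal point normalized to 0.5 are assumed here), projecting a camera-frame point with a standard OpenCV-style K looks like this:

```python
import numpy as np

# Standard OpenCV-style intrinsics (all entries positive; image
# coordinates normalized to [0, 1] as in MOVi, so c_x = c_y = 0.5).
K_cv = np.array([
    [1.09375, 0.0,     0.5],
    [0.0,     1.09375, 0.5],
    [0.0,     0.0,     1.0],
])

# A point in OpenCV camera coordinates: x right, y down, z forward.
point_cam = np.array([0.2, -0.1, 2.0])

uvw = K_cv @ point_cam           # homogeneous image coordinates
u, v = uvw[:2] / uvw[2]          # perspective divide by the depth Z
print(u, v)                      # → 0.609375 0.4453125
```

Note that the depth Z is positive for points in front of the camera under this convention, which is exactly what breaks in the OpenGL convention below.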
2. The Kubric/OpenGL Camera Convention
In Kubric (and other graphics applications like Blender/OpenGL), the camera coordinate system is different:
- +X axis points to the right.
- +Y axis points up.
- +Z axis points backward (out of the screen, toward the camera).
This means the Y and Z axes are inverted compared to the OpenCV convention.
Let's look at the code:
https://github.com/google-research/kubric/blob/4d5a0d4ee80cac1c318f58bed83db284f6c70036/challenges/point_tracking/dataset.py#L248-L253
This corresponds to the matrix:
$$ \mathbf{K}_{kubric} = \begin{pmatrix} f_x & 0 & -c_x \\ 0 & -f_y & -c_y \\ 0 & 0 & -1 \end{pmatrix} $$
This directly explains what you observed:
- `K[1, 1]` is negative (`-f_y`): this accounts for the flipped Y-axis (up vs. down).
- The last column `K[:, 2]` is negative (`-c_x`, `-c_y`, `-1`): this accounts for the flipped Z-axis (forward vs. backward depth).
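To see that the two matrices describe the same camera, here is a small NumPy check (illustrative only, using the MOVi values above): applying the Kubric K to a point expressed in the OpenGL camera frame gives the same pixel as applying the OpenCV K to that point expressed in the OpenCV camera frame.

```python
import numpy as np

f, c = 1.09375, 0.5

K_kubric = np.array([[f, 0.0, -c], [0.0, -f, -c], [0.0, 0.0, -1.0]])
K_cv     = np.array([[f, 0.0,  c], [0.0,  f,  c], [0.0, 0.0,  1.0]])

# The same physical point, expressed in each camera frame.
# OpenCV: x right, y down, z forward (positive depth).
p_cv = np.array([0.2, -0.1, 2.0])
# OpenGL/Kubric: x right, y up, z backward -> flip the y and z axes.
p_gl = np.array([1.0, -1.0, -1.0]) * p_cv

uv_cv = (K_cv @ p_cv)[:2] / (K_cv @ p_cv)[2]
uv_gl = (K_kubric @ p_gl)[:2] / (K_kubric @ p_gl)[2]

print(np.allclose(uv_cv, uv_gl))  # → True: both conventions agree
```

The sign flips in the last two rows of `K_kubric` exactly cancel the sign flips of the axis change, which is why the negative entries are harmless as long as you feed in points in the matching camera frame.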
3. How to Convert to the OpenCV Standard in Your Code
If you want to use a standard OpenCV camera model in your own pipeline, you must convert both the intrinsic and extrinsic (pose) matrices.
Modify the intrinsics matrix to the OpenCV standard. As you suggested, change the intrinsics definition to:

```python
intrinsics.append(
    tf.stack([
        tf.stack([f_x, 0., p_x]),
        tf.stack([0., f_y, p_y]),
        tf.stack([0., 0., 1.]),
    ])
)
```
Convert the Camera Pose (Extrinsics) from OpenGL to OpenCV:
After creating the standard intrinsics, you must also convert the camera pose (matrix_world). You do this by applying a transformation matrix that flips the Y and Z axes of the pose.
Original version:
https://github.com/google-research/kubric/blob/4d5a0d4ee80cac1c318f58bed83db284f6c70036/challenges/point_tracking/dataset.py#L255-L268
Modification:

```python
position = cam_positions[frame_idx]
quat = cam_quaternions[frame_idx]
rotation_matrix = rotation_matrix_3d.from_quaternion(
    tf.concat([quat[1:], quat[0:1]], axis=0)
)
transformation = tf.concat(
    [rotation_matrix, position[:, tf.newaxis]],
    axis=1,
)
transformation = tf.concat(
    [transformation,
     tf.constant([0.0, 0.0, 0.0, 1.0])[tf.newaxis, :]],
    axis=0,
)

# ADD THIS: convert the camera pose (camera-to-world matrix) from the
# OpenGL convention to the OpenCV convention by flipping the Y and Z
# axes of the camera frame. This flip matrix is its own inverse, so
# right-multiplying by it works in either direction.
cv_from_gl_transform = tf.constant([
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1],
], dtype=tf.float32)
transformation = tf.matmul(transformation, cv_from_gl_transform)

matrix_world.append(transformation)
```
By applying both of these changes, your entire pipeline will correctly operate under the standard OpenCV camera model.
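As a sanity check, you can verify in plain NumPy that the two full pipelines (original Kubric convention vs. the converted OpenCV convention) project a world point to the same pixel. The pose below is a hypothetical example (identity rotation, camera at z = 5), not taken from any MOVi sequence:

```python
import numpy as np

flip = np.diag([1.0, -1.0, -1.0, 1.0])  # y/z flip; its own inverse

# Hypothetical camera-to-world pose in the OpenGL convention:
# identity rotation, camera at (0, 0, 5) looking down -z.
world_from_gl = np.eye(4)
world_from_gl[:3, 3] = [0.0, 0.0, 5.0]

# Converted pose for the OpenCV convention, as in the snippet above.
world_from_cv = world_from_gl @ flip

f, c = 1.09375, 0.5
K_kubric = np.array([[f, 0.0, -c], [0.0, -f, -c], [0.0, 0.0, -1.0]])
K_cv     = np.array([[f, 0.0,  c], [0.0,  f,  c], [0.0, 0.0,  1.0]])

point_world = np.array([0.3, 0.2, 1.0, 1.0])  # in front of the camera

# World -> camera frame for each convention.
p_gl = (np.linalg.inv(world_from_gl) @ point_world)[:3]
p_cv = (np.linalg.inv(world_from_cv) @ point_world)[:3]

uv_gl = (K_kubric @ p_gl)[:2] / (K_kubric @ p_gl)[2]
uv_cv = (K_cv @ p_cv)[:2] / (K_cv @ p_cv)[2]

print(np.allclose(uv_gl, uv_cv))  # → True: same pixel either way
```

If this check fails in your own pipeline, the usual culprit is converting only one of the two matrices (intrinsics or extrinsics) rather than both.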
Hope this clears things up for you!
Best,