
camera pose convention

za-cheng opened this issue 2 years ago · 5 comments

Hi there,

First, huge thanks for publishing this dataset. I'm hoping to use your dataset for MVS but am struggling with the camera pose convention in Trace.txt. Could you provide some more explanation, please?

Are these camera-to-world or world-to-camera matrices, and how is the camera coordinate system defined (i.e., what are the +x, +y, +z directions)? I assumed +x is right, +y is down, and +z is forward, but apparently that's not the case. I also notice the matrix has the translation on the last row instead of the last column as in the MVS convention; should I transpose the rotation matrix as well?

Cheers, Z

za-cheng avatar Sep 20 '22 00:09 za-cheng

Hi, za-cheng,

Thanks for your interest in IRS.

  1. The matrices are camera-to-world.
  2. For the directions, +x is forward, +y is right, and +z is up, which follows the UE4 standard. This is also called a left-handed system.
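As a quick sanity check (a hypothetical snippet, not part of the dataset tooling), the axis remapping from UE4's left-handed frame to the OpenCV-style frame (+x right, +y down, +z forward) can be written as a signed permutation matrix:

```python
import numpy as np

# Map UE4 axes (+x forward, +y right, +z up) onto OpenCV-style axes
# (+x right, +y down, +z forward).
T = np.array([
    [0, 1, 0],   # OpenCV x =  UE y (right)
    [0, 0, -1],  # OpenCV y = -UE z (down)
    [1, 0, 0],   # OpenCV z =  UE x (forward)
], dtype=np.float64)

ue_forward = np.array([1.0, 0.0, 0.0])
print(T @ ue_forward)       # -> [0. 0. 1.], i.e. OpenCV +z (forward)
print(np.linalg.det(T))     # -> -1.0, reflecting the handedness flip
```

The determinant of -1 is exactly what you expect when converting between a left-handed and a right-handed system.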

We also noticed your excellent SIGGRAPH 2022 work, "Diffeomorphic Neural Surface Parameterization for 3D and Reflectance Acquisition". We hope we can find an opportunity to collaborate. Cheers!

Best regards, Qiang Wang

blackjack2015 avatar Sep 21 '22 03:09 blackjack2015

Thanks for the valuable discussion here. Could you explain how to obtain the camera-to-world matrix from Trace.txt? As @za-cheng asked: "I also notice the matrix has translation on last row, instead of last column in MVS convention, should I transpose rotation matrix as well?"

For example: in a Trace.txt file, we can see:

0.09689467371068361 0.9952946409010922 -2.565641006486985e-07 0.0
-0.9951432653217077 0.09687994137690414 0.017440138865749036 0.0
0.0173581016055659 -0.0016896012468288792 0.999847909212335 0.0
-6.26467896 -20.06545654 1.50066193 1.0

How do we get the regular "row-major" camera-to-world matrix? What should it be, given the 4 lines above?

Thanks!

ccj5351 avatar Mar 04 '24 00:03 ccj5351

Finally, I got the camera-to-world pose in the OpenCV style coordinate system.

Please see my code on how to generate the camera pose from the "*/UE_Trace.txt" file (e.g., */IRS/Auxiliary/CameraPos/Restaurant/DinerEnvironment_Dark/UE_Trace.txt).

As for the transformation matrix from Unreal Engine (x Forward, y Right, z Up) to OpenCV-style (x Right, y Down, z Forward) coordinates:

You can check the details in Chapter 2.2 of John J. Craig, Introduction to Robotics: Mechanics and Control, Third Edition (2005); see the screenshot below:

image

This way, we can get the matrix from Unreal Engine to OpenCV-style as:

    import numpy as np

    # Basis change from Unreal Engine (x Forward, y Right, z Up)
    # to OpenCV-style (x Right, y Down, z Forward) coordinates.
    T = np.array([[0, 1, 0, 0],
                  [0, 0, -1, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1]], dtype=np.float32)
    T_wue_2_w = T  # world (UE) -> world (OpenCV-style)
    # Similarly, the camera-frame change from cue to c uses the same matrix.
    T_cue_2_c = T  # camera (UE) -> camera (OpenCV-style)
    T_c_2_cnet = np.linalg.inv(T_cue_2_c)
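Putting the pieces together, here is a minimal end-to-end sketch (my reconstruction for illustration, not the repository's script). It takes the sample 4-line block quoted earlier, transposes it on the assumption that the translation is stored on the last row, and applies the basis change on both the world and camera sides:

```python
import numpy as np

# Basis change from UE (x Forward, y Right, z Up) to
# OpenCV-style (x Right, y Down, z Forward), as above.
T = np.array([[0, 1, 0, 0],
              [0, 0, -1, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1]], dtype=np.float64)

# Sample 4-line block from a Trace.txt (translation on the last row).
rows = np.array([
    [0.09689467371068361, 0.9952946409010922, -2.565641006486985e-07, 0.0],
    [-0.9951432653217077, 0.09687994137690414, 0.017440138865749036, 0.0],
    [0.0173581016055659, -0.0016896012468288792, 0.999847909212335, 0.0],
    [-6.26467896, -20.06545654, 1.50066193, 1.0],
])

# Transpose so the translation lands in the last column
# (the conventional column-vector form).
c2w_ue = rows.T

# Change of basis on both sides: world (UE) -> world (OpenCV-style)
# on the left, camera (OpenCV-style) -> camera (UE) on the right.
c2w_cv = T @ c2w_ue @ np.linalg.inv(T)

print(c2w_cv[:3, 3])  # camera center in OpenCV-style world axes
```

Whether the rotation block needs a further transpose depends on the dataset's storage convention; the depth-warping verification is the decisive check.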

The generated camera poses have been verified by depth warping among multi-view images:

You can see that the pixel highlighted by a red circle is correctly warped to the corresponding position in the other view, highlighted by a green circle.
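For reference, such a depth-warping check can be sketched as follows (a hypothetical helper, assuming shared pinhole intrinsics K and OpenCV-style 4x4 camera-to-world poses; not the author's actual verification code):

```python
import numpy as np

def warp_pixel(uv, depth, K, c2w_a, c2w_b):
    """Warp pixel `uv` with known depth from view A into view B."""
    # Back-project the pixel to a 3D point in camera-A coordinates.
    p_cam_a = depth * (np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0]))
    # Camera A -> world -> camera B.
    p_world = c2w_a @ np.append(p_cam_a, 1.0)
    p_cam_b = np.linalg.inv(c2w_b) @ p_world
    # Project into view B with the same intrinsics.
    proj = K @ p_cam_b[:3]
    return proj[:2] / proj[2]

# Sanity check: identical poses must map a pixel onto itself.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
print(warp_pixel((60.0, 40.0), 2.0, K, np.eye(4), np.eye(4)))  # -> [60. 40.]
```

If the poses are correct, a pixel with the ground-truth depth should land on the corresponding scene point in the other view.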

image

ccj5351 avatar Mar 16 '24 05:03 ccj5351


Excellent! Would you mind opening a pull request to help us refine the project? Thank you very much!

Best regards, Qiang Wang

blackjack2015 avatar Mar 17 '24 15:03 blackjack2015

Sure. My pleasure. Just made the pull request. Thanks!

ccj5351 avatar Mar 17 '24 23:03 ccj5351