ObjectCapture HEIC rotation appears to be a 180-degree rotation around the X-axis
I'm trying to extract the transform from a HEIC file generated by iPhone's ObjectCapture. I take a photo with the camera nearly at the origin and with no rotation, using the following steps:
- Quit RealityComposer.
- Hold the iPhone upright in portrait orientation, keeping it perfectly still from this point on.
- Launch RealityComposer.
- Start a capture with ObjectCapture and take the first photo.
- Stop the capture and transfer the data to a Mac via AirDrop.
On the Mac, right-click the transferred file, open the package contents, and select the first HEIC file. I believe this photo was taken near the origin of ARKit's coordinate system, so its transform should have little to no rotation.
The position is nearly zero, which is fine. However, the rotation appears to be a 180-degree rotation around the X-axis. The camera's forward direction in camera coordinates is [0, 0, -1]. I applied the translation and rotation matrices to plot this in world coordinates. I expected the resulting vector to point forward, but it ended up pointing straight backward. When I checked the rotation values using examples/heif-info, I got the following:
properties:
camera intrinsic matrix:
focal length: 3022.019286; 3022.019286
principal point: 1518.492311; 2005.849608
skew: 0.000000
camera extrinsic matrix:
rotation matrix:
0.999 -0.025 0.040
-0.017 -0.982 -0.188
0.044 0.187 -0.981
Since the camera was barely moved, I expected the rotation matrix to be close to the identity matrix, but the Y and Z diagonal entries are close to -1.
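For reference, the forward-vector check described above can be sketched in plain Python. This is only a minimal sketch, assuming the matrix printed by heif-info maps camera coordinates to world coordinates; the helper mat_vec is illustrative and not part of libheif.

```python
# Rotation matrix as printed by heif-info above (rows, rounded to 3 decimals).
R = [
    [0.999, -0.025, 0.040],
    [-0.017, -0.982, -0.188],
    [0.044, 0.187, -0.981],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# Camera forward direction in camera coordinates (-Z, as stated above).
forward_cam = [0.0, 0.0, -1.0]

# Assumption: R is camera-to-world. If it were world-to-camera,
# the transpose of R would be needed here instead.
forward_world = mat_vec(R, forward_cam)
print(forward_world)
```

With these values the Z component of the result is positive, so the forward vector points along +Z, which is backward in the convention used here.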
Is this behavior correct, or is something going wrong?
I have no way to check this because I have no iPhone. As far as I remember, I tested this against some conformance images and the decoding was ok.
The rotation matrix is either stored as three rotation angles or as a quaternion. The rotation matrix is computed in libheif in this function: https://github.com/strukturag/libheif/blob/d84f58fe0af319f01ec2fd1739873f10400253b5/libheif/box.cc#L4497-L4554
You may check the raw data from the image with heif-info -d ....
Look for the cmex box. There you can find the raw data as stored in the file. Please copy-paste that box's dump here.
Thanks for the advice.
| | | index: 6
| | | Box: 4363e914-5b7d-4aab-97ae-bea69803b434 ----- (Camera Extrinsic Matrix)
| | | size: 56 (header size: 28)
| | | camera position (um): -59918 ; -16182 ; -69305
| | | orientation (quaterion)
| | | q = [0.995296;-0.0105958;0.0209387;0.0939975]
| | | world coordinate system id: 0
| | |
| | | index: 7
| | | Box: 22cc04c7-d6d9-4e07-9d90-4eb6ecbaf3a3 ----- (Camera Intrinsic Matrix)
| | | size: 40 (header size: 28)
| | | principal-point: 0.37661, 0.66331
| | | focal-length: 0.749509
| | | no skew
(ChatGPT)
This quaternion encodes an ~11° rotation in ARKit’s right-handed (X→right, Y→up, Z→back) coordinate system:
Roll (X-axis): –1°
Pitch (Y-axis): +2.5°
Yaw (Z-axis): +10.8°
Or as an axis–angle: rotate ~11° about the axis ≈(–0.11, 0.22, 0.97).
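The axis-angle figures can be recomputed with a few lines of Python. This is a sketch under one explicit assumption: the first printed component is the scalar part w, i.e. the order is (w, x, y, z). Whether the cmex dump actually uses that order is exactly what is in question in this thread.

```python
import math

# Quaternion components in the order printed in the cmex dump above.
q = [0.995296, -0.0105958, 0.0209387, 0.0939975]

# Assumption: component order is (w, x, y, z).
w, x, y, z = q

# For a unit quaternion, w = cos(angle / 2) and |(x, y, z)| = sin(angle / 2).
angle_deg = math.degrees(2.0 * math.acos(w))
s = math.sqrt(x * x + y * y + z * z)
axis = [x / s, y / s, z / s]

print(round(angle_deg, 1), [round(a, 2) for a in axis])
```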
So the quaternion itself seems to point almost straight forward.
The discrepancy may have something to do with image rotation, cropping, or flipping. I'm not even sure the iPhone output is correct.
I will take other pictures and look into it.
The image and the entire output are also attached. 00008.167612625.HEIC.zip
I copy-pasted the quaternion into this online converter: https://www.andre-gaschler.com/rotationconverter/ and got the same rotation matrix as libheif.
I found that reading a zero-rotation quaternion in xyzw order instead of wxyz (or vice versa) yields a 180° rotation about the X-axis. I'll continue investigating.
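To make the effect of the component order concrete, here is a small sketch that converts the dumped quaternion to a rotation matrix under both readings. The function quat_to_matrix is a hypothetical helper using the standard Hamilton-convention formula; it is not libheif's code, and which reading the file format intends is not settled here.

```python
def quat_to_matrix(w, x, y, z):
    """Rotation matrix (list of rows) of a unit quaternion, Hamilton convention."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


# Components in the order printed in the cmex dump above.
q = [0.995296, -0.0105958, 0.0209387, 0.0939975]

# Reading 1: the printed order is (w, x, y, z).
R_wxyz = quat_to_matrix(q[0], q[1], q[2], q[3])
# Reading 2: the printed order is (x, y, z, w).
R_xyzw = quat_to_matrix(q[3], q[0], q[1], q[2])

for name, R in (("wxyz", R_wxyz), ("xyzw", R_xyzw)):
    print(name)
    for row in R:
        print("  ", [round(v, 3) for v in row])
```

Under the (x, y, z, w) reading the result agrees with the matrix printed by heif-info to three decimals, while the (w, x, y, z) reading gives a matrix close to the identity.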
There are two conventions for quaternions: Hamilton and JPL. The right answer depends on the convention. A 180° rotation could be the result of using the wrong convention. There are other possible reasons too, such as a difference in reference frame, or whether it is the frame or the object that is rotated.
https://github.com/strukturag/libheif/blob/e0bfb132ab984ad3d7703b5183a9fb336d96f8a0/libheif/context.cc#L638
This could be the cause.
Might be. There is some discussion about cmex vs. irot here: https://github.com/MPEGGroup/FileFormat/issues/102
In my view this is ill-defined, or at least I don't see how to interpret it correctly. If you have any insights, let me know.
On the other hand, an irot of 270 degrees also does not explain a rotation by 180 degrees in the cmex.
This is capture data from walking halfway around a table. The camera was held at chest height and tilted diagonally downward.
You can check the captured photos and plots in this notebook:
https://colab.research.google.com/drive/1IgTUDrqUG1Awnq7uOdMpTczkNoaaFEii
If you apply the obtained rotation as-is, the light-blue camera’s viewing direction points outward.
Rotating this around the X-axis by 180 degrees produces a plot that looks correct.
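The 180-degree correction can be sketched as follows. This is an illustration only: it uses the rotation matrix from the first photo earlier in the thread rather than the notebook's data, and it assumes the flip is applied in the camera's local frame (right-multiplication). The helpers mat_mul and mat_vec are illustrative names, and the notebook may apply the correction differently.

```python
# Rotation matrix from the first photo, as printed by heif-info earlier.
R = [
    [0.999, -0.025, 0.040],
    [-0.017, -0.982, -0.188],
    [0.044, 0.187, -0.981],
]

# 180-degree rotation about the X-axis: keeps X, negates Y and Z.
RX_180 = [
    [1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, -1.0],
]


def mat_mul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# Assumption: the flip is applied in camera coordinates, before R.
R_fixed = mat_mul(R, RX_180)

# Camera forward (-Z in camera coordinates) mapped through the corrected rotation.
forward_world = mat_vec(R_fixed, [0.0, 0.0, -1.0])
print(forward_world)
```

After the correction the Z component of the forward vector is negative, so for this matrix the camera looks along -Z as originally expected.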