ObjectCapture HEIC rotation appears to include a 180-degree rotation around the X-axis

Open takeru opened this issue 7 months ago • 10 comments

I'm trying to extract the transform from a HEIC file generated by iPhone's ObjectCapture. I take a photo with the camera nearly at the origin and with no rotation, using the following steps:

  • Quit RealityComposer.
  • Hold the iPhone upright in portrait orientation, keeping it perfectly still from this point on.
  • Launch RealityComposer.
  • Start a capture with ObjectCapture and take the first photo.
  • Stop the capture and transfer the data to a Mac via AirDrop.

On the Mac, right-click the transferred file, open the package contents, and select the first HEIC file. I believe the camera pose stored in this HEIC file is near the origin of ARKit's coordinate system, with little to no rotation.

The position is nearly zero, which is fine. However, the rotation appears to be a 180-degree rotation around the X-axis. The camera's forward direction in camera coordinates is [0, 0, -1]. I applied the translation and rotation matrices to plot this in world coordinates. I expected the resulting vector to point forward, but it ended up pointing straight backward. When I checked the rotation values using examples/heif-info, I got the following:

properties:
  camera intrinsic matrix:
    focal length: 3022.019286; 3022.019286
    principal point: 1518.492311; 2005.849608
    skew: 0.000000
  camera extrinsic matrix:
    rotation matrix:
       0.999 -0.025  0.040
      -0.017 -0.982 -0.188
       0.044  0.187 -0.981

Since the camera was barely moved, I expected the rotation matrix to be close to the identity matrix, but the Y and Z diagonal entries are close to -1.
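
The check described above can be sketched in a few lines of Python. This is only an illustration: it uses the rounded matrix values printed by heif-info, ignores the translation (which does not affect a direction), and the helper name `mat_vec` is made up for this example.

```python
# Rotation matrix as printed by heif-info (rounded to three decimals).
R = [
    [0.999, -0.025, 0.040],
    [-0.017, -0.982, -0.188],
    [0.044, 0.187, -0.981],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# The camera looks down -Z in its own coordinate system.
forward_camera = [0.0, 0.0, -1.0]
forward_world = mat_vec(R, forward_camera)
print(forward_world)  # approximately [-0.04, 0.188, 0.981]
```

The Z component comes out near +1, so the transformed forward vector points along +Z instead of -Z, which is the "straight backward" result described above.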

Is this behavior correct, or is something going wrong?

takeru avatar May 22 '25 14:05 takeru

I have no way to check this because I have no iPhone. As far as I remember, I tested this against some conformance images and the decoding was ok.

The rotation matrix is either stored as three rotation angles or as a quaternion. The rotation matrix is computed in libheif in this function: https://github.com/strukturag/libheif/blob/d84f58fe0af319f01ec2fd1739873f10400253b5/libheif/box.cc#L4497-L4554

You may check the raw data from the image with heif-info -d .... Look for the cmex box. There you can find the raw data as stored in the file. Please copy-paste that box's dump here.

farindk avatar May 22 '25 15:05 farindk

Thanks for the advice.

| | | index: 6
| | | Box: 4363e914-5b7d-4aab-97ae-bea69803b434 ----- (Camera Extrinsic Matrix)
| | | size: 56   (header size: 28)
| | | camera position (um): -59918 ; -16182 ; -69305
| | | orientation (quaterion)
| | |   q = [0.995296;-0.0105958;0.0209387;0.0939975]
| | | world coordinate system id: 0
| | | 
| | | index: 7
| | | Box: 22cc04c7-d6d9-4e07-9d90-4eb6ecbaf3a3 ----- (Camera Intrinsic Matrix)
| | | size: 40   (header size: 28)
| | | principal-point: 0.37661, 0.66331
| | | focal-length: 0.749509
| | | no skew

(ChatGPT's interpretation of this quaternion follows.)

This quaternion encodes an ~11° rotation in ARKit’s right-handed (X→right, Y→up, Z→back) coordinate system:

  • Roll (X-axis): -1°
  • Pitch (Y-axis): +2.5°
  • Yaw (Z-axis): +10.8°

Or as an axis–angle: rotate ~11° about the axis ≈(–0.11, 0.22, 0.97).

The quaternion seems to be oriented almost forward.
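
The axis-angle figures quoted above can be reproduced with a short Python sketch. It assumes the four dumped numbers are ordered (w, x, y, z), which is the reading that yields a small rotation; whether that ordering matches what the file actually stores is not established here.

```python
import math

# Values from the cmex dump, read here as (w, x, y, z). That ordering is an assumption.
w, x, y, z = 0.995296, -0.0105958, 0.0209387, 0.0939975

# Length of the vector part; for a unit quaternion this equals sin(angle / 2).
vec_norm = math.sqrt(x * x + y * y + z * z)

# Rotation angle in degrees and the unit rotation axis.
angle_deg = math.degrees(2.0 * math.atan2(vec_norm, w))
axis = (x / vec_norm, y / vec_norm, z / vec_norm)

print(round(angle_deg, 1))
print(tuple(round(c, 2) for c in axis))
```

Under this ordering the result is roughly an 11° rotation about an axis close to (-0.11, 0.22, 0.97), in line with the numbers above.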

It seems to have something to do with rotation, cropping, and flipping. I'm not even sure if the iPhone output is correct.

I will take other pictures and look into it.

The image and the entire output are also attached. 00008.167612625.HEIC.zip

00008-heif-info-d.txt

takeru avatar May 22 '25 22:05 takeru

I copy-pasted the quaternion into this online converter (https://www.andre-gaschler.com/rotationconverter/) and got the same rotation matrix as libheif:

Image

farindk avatar May 22 '25 22:05 farindk

I found that reading a zero-rotation quaternion in xyzw order when it was written in wxyz order (or vice versa) produces a 180° rotation about the X-axis. I'll continue investigating.
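
The effect of the component ordering can be sketched in Python. The function `quat_to_matrix` below is a hand-written helper for this illustration (not a libheif API) using the standard unit-quaternion formula with scalar part w. The sketch only shows what each reading of the dumped numbers produces; it does not settle which ordering the file intends.

```python
def quat_to_matrix(x, y, z, w):
    """Rotation matrix of a unit quaternion (Hamilton convention, scalar part w)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


# The four numbers from the cmex dump, in the order printed.
q = [0.995296, -0.0105958, 0.0209387, 0.0939975]

# Reading A: the numbers are (x, y, z, w).
m_xyzw = quat_to_matrix(q[0], q[1], q[2], q[3])
# Reading B: the numbers are (w, x, y, z).
m_wxyz = quat_to_matrix(q[1], q[2], q[3], q[0])

# An identity rotation written as (w, x, y, z) = (1, 0, 0, 0) but read as (x, y, z, w).
m_swapped_identity = quat_to_matrix(1.0, 0.0, 0.0, 0.0)

for name, m in [("xyzw", m_xyzw), ("wxyz", m_wxyz), ("swapped identity", m_swapped_identity)]:
    print(name)
    for row in m:
        print("  ", " ".join(f"{v: .3f}" for v in row))
```

Reading A reproduces the matrix that heif-info printed, reading B gives a matrix close to the identity, and the swapped identity quaternion becomes diag(1, -1, -1), which is exactly a 180° rotation about X.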

takeru avatar May 22 '25 23:05 takeru

There are two conventions for quaternions: Hamilton and JPL. The right answer depends on which convention is in use. A 180° rotation could come from applying the wrong convention, although there are other possible causes, such as a difference in reference frame, or whether the rotation is applied to the frame or to the object.

bradh avatar May 23 '25 09:05 bradh

https://github.com/strukturag/libheif/blob/e0bfb132ab984ad3d7703b5183a9fb336d96f8a0/libheif/context.cc#L638

This could be the cause.

takeru avatar May 23 '25 14:05 takeru

Might be. There is some discussion about cmex vs. irot here: https://github.com/MPEGGroup/FileFormat/issues/102. In my view this is ill-defined, or at least I don't see how to interpret it correctly. If you have any insights, let me know.

farindk avatar May 23 '25 14:05 farindk

On the other hand, an irot of 270 degrees also does not explain a rotation by 180 degrees in the cmex.

farindk avatar May 23 '25 14:05 farindk

This is capture data from moving halfway around a table. The camera was held at chest height and tilted diagonally downward.

You can check the captured photos and plots in this notebook:

https://colab.research.google.com/drive/1IgTUDrqUG1Awnq7uOdMpTczkNoaaFEii

If you apply the obtained rotation as-is, the light-blue camera’s viewing direction points outward.

Image

Image

Rotating this around the X-axis by 180 degrees produces a plot that looks correct.

Image

Image
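
The 180° correction about X can be sketched in Python as follows. The thread does not say whether the extra rotation was applied in the camera's local frame (post-multiplication) or in the world frame (pre-multiplication), so both are shown; the helper names are made up for this example, and the matrix is the rounded heif-info output from the first photo, not the table capture.

```python
# Rotation matrix as printed by heif-info for the first photo (rounded).
R = [
    [0.999, -0.025, 0.040],
    [-0.017, -0.982, -0.188],
    [0.044, 0.187, -0.981],
]

# A 180-degree rotation about the X-axis negates the Y and Z axes.
RX_180 = [
    [1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, -1.0],
]


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def forward(m):
    """World-space viewing direction: m applied to the camera-space vector [0, 0, -1]."""
    return [-m[i][2] for i in range(3)]


# Assumption 1: the flip is about the camera's own X-axis.
forward_post = forward(mat_mul(R, RX_180))
# Assumption 2: the flip is about the world X-axis.
forward_pre = forward(mat_mul(RX_180, R))

print(forward_post)
print(forward_pre)
```

For this matrix both variants turn the viewing direction back toward -Z; they differ in the sign of the small X component, so telling them apart requires a capture with more camera motion, such as the table sequence.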


takeru avatar May 27 '25 23:05 takeru