
API for exporting mesh and color data

Open philipturner opened this issue 1 year ago • 3 comments

I am working on an API to export the raw data from ARHeadsetKit. It takes all geometry and color data present at one point in time, then reorders it into a simple serialization format. This is entirely lossless and should happen very quickly.

  • To work with this format, you need a programming language with low-level access to memory layout. Python is not ideal; rather, use Swift or C/C++ compiled in release mode.

The next step is converting the YCbCr data to RGB (if you wish) or rearranging the geometry data into a different layout.

  • First, convert the luma and chroma into PNG images before converting to RGB. Each plane can be viewed as an image on its own.
    • This can be done for a single triangle (a very tiny image).
  • Next, read this: https://developer.apple.com/documentation/accelerate/conversion/understanding_ypcbcr_image_formats
  • Apple has utilities in vImage for converting YCbCr -> RGB on the CPU: https://developer.apple.com/documentation/accelerate/1533189-vimageconvert_ypcbcrtoargb_gener
    • Concatenate all the chroma tiles into a massive array, do the same with luma, and send that through vImage.
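For a quick check without vImage, the conversion can also be written as a plain scalar loop. The sketch below is not the vImage route described above; it is a minimal alternative that assumes full-range BT.601 coefficients, chroma stored as (Cb, Cr) pairs, and one chroma sample per 2x2 block of luma. None of these details are stated in this issue, and the names `RGB`, `convertPixel`, and `convertTile` are hypothetical.

```swift
// Sketch: convert one tile from 4:2:0 YCbCr to RGB on the CPU.
// Assumptions (not stated in the issue): full-range BT.601 coefficients,
// chroma stored as (Cb, Cr), one chroma sample per 2x2 block of luma.

struct RGB: Equatable {
  var r: UInt8
  var g: UInt8
  var b: UInt8
}

/// Converts a single full-range YCbCr sample to 8-bit RGB.
func convertPixel(luma: UInt8, cb: UInt8, cr: UInt8) -> RGB {
  let y = Float(luma)
  let u = Float(cb) - 128  // center chroma around zero
  let v = Float(cr) - 128

  // Round, then clamp to the representable 0...255 range.
  func clamp(_ x: Float) -> UInt8 {
    UInt8(max(0, min(255, x.rounded())))
  }
  return RGB(
    r: clamp(y + 1.402 * v),
    g: clamp(y - 0.3441 * u - 0.7141 * v),
    b: clamp(y + 1.772 * u))
}

/// Converts a whole tile. `luma` holds side * side bytes (row-major) and
/// `chroma` holds (side / 2) * (side / 2) pairs (row-major).
func convertTile(luma: [UInt8], chroma: [SIMD2<UInt8>], side: Int) -> [RGB] {
  precondition(luma.count == side * side)
  precondition(chroma.count == (side / 2) * (side / 2))

  var output: [RGB] = []
  output.reserveCapacity(side * side)
  for row in 0..<side {
    for column in 0..<side {
      // Each chroma sample is shared by a 2x2 block of luma pixels.
      let c = chroma[(row / 2) * (side / 2) + column / 2]
      output.append(convertPixel(luma: luma[row * side + column], cb: c.x, cr: c.y))
    }
  }
  return output
}
```

For a small tile, `side` would be 12 with 36 chroma pairs; for a large tile, 28 with 196 pairs. If the colors look tinted, the first things to check are the Cb/Cr order and the choice of coefficients, since both are guesses here.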

cc @knightfork

philipturner avatar Jul 01 '23 20:07 philipturner

Serialization format:

64 bytes {
  UInt32 - number of small triangles
  UInt32 - number of large triangles
  56 bytes of padding
}
repeat for number of small triangles {
  320 bytes {
    [SIMD4<Float>](count: 3) - vertices; SIMD4<Float>(x, y, z, 0)
    [SIMD2<Float>](count: 3) - texture coordinates; SIMD2<Float>(u, v)
    32 bytes of padding
    [SIMD2<UInt8>](count: 36) - row-major 6x6 matrix of chroma
    [UInt8](count: 144) - row-major 12x12 matrix of luma
  }
}
repeat for number of large triangles {
  1280 bytes {
    [SIMD4<Float>](count: 3) - vertices; SIMD4<Float>(x, y, z, 0)
    [SIMD2<Float>](count: 3) - texture coordinates; SIMD2<Float>(u, v)
    32 bytes of padding
    [SIMD2<UInt8>](count: 196) - row-major 14x14 matrix of chroma
    [UInt8](count: 784) - row-major 28x28 matrix of luma
  }
}
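
To illustrate how the offsets in this layout fall out, here is a minimal reader sketch. It is not the exporter API itself; `Triangle`, `readUInt32`, `readFloat`, `decodeTriangle`, and `parse` are hypothetical names. It assumes the values are little-endian (the native byte order on Apple devices), which this issue does not state, and it reads byte by byte so that alignment does not matter.

```swift
// Sketch of a reader for the layout above. Assumes little-endian values;
// the issue does not state the byte order.

struct Triangle {
  var vertices: [SIMD3<Float>]            // three (x, y, z) positions
  var textureCoordinates: [SIMD2<Float>]  // three (u, v) pairs
  var chroma: [UInt8]                     // interleaved two-byte chroma samples
  var luma: [UInt8]                       // one byte per pixel
}

func readUInt32(_ bytes: [UInt8], at offset: Int) -> UInt32 {
  var value: UInt32 = 0
  for i in 0..<4 {
    value |= UInt32(bytes[offset + i]) << (8 * i)
  }
  return value
}

func readFloat(_ bytes: [UInt8], at offset: Int) -> Float {
  Float(bitPattern: readUInt32(bytes, at: offset))
}

func decodeTriangle(_ bytes: [UInt8], at base: Int, lumaSide: Int) -> Triangle {
  var vertices: [SIMD3<Float>] = []
  var coordinates: [SIMD2<Float>] = []
  for i in 0..<3 {
    let v = base + 16 * i  // each vertex is a 16-byte SIMD4<Float>; w is skipped
    vertices.append(SIMD3<Float>(
      readFloat(bytes, at: v), readFloat(bytes, at: v + 4), readFloat(bytes, at: v + 8)))
    let t = base + 48 + 8 * i  // texture coordinates follow the 48 bytes of vertices
    coordinates.append(SIMD2<Float>(readFloat(bytes, at: t), readFloat(bytes, at: t + 4)))
  }
  // 48 (vertices) + 24 (texture coordinates) + 32 (padding) = 104
  let chromaStart = base + 104
  let lumaStart = chromaStart + 2 * (lumaSide / 2) * (lumaSide / 2)
  let lumaEnd = lumaStart + lumaSide * lumaSide
  return Triangle(
    vertices: vertices,
    textureCoordinates: coordinates,
    chroma: Array(bytes[chromaStart..<lumaStart]),
    luma: Array(bytes[lumaStart..<lumaEnd]))
}

/// Returns nil if the buffer is too short for the counts in its header.
func parse(_ bytes: [UInt8]) -> (small: [Triangle], large: [Triangle])? {
  guard bytes.count >= 64 else { return nil }
  let smallCount = Int(readUInt32(bytes, at: 0))
  let largeCount = Int(readUInt32(bytes, at: 4))
  guard bytes.count >= 64 + 320 * smallCount + 1280 * largeCount else { return nil }

  var small: [Triangle] = []
  var large: [Triangle] = []
  var offset = 64
  for _ in 0..<smallCount {
    small.append(decodeTriangle(bytes, at: offset, lumaSide: 12))
    offset += 320
  }
  for _ in 0..<largeCount {
    large.append(decodeTriangle(bytes, at: offset, lumaSide: 28))
    offset += 1280
  }
  return (small, large)
}
```

The record sizes check out against the spec: 104 + 72 + 144 = 320 bytes for small triangles and 104 + 392 + 784 = 1280 bytes for large ones. Copying each tile into its own array keeps the sketch simple; a faster reader would work on the buffer in place.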

Texture coordinates are local coordinates within the 12x12 or 28x28 tile. They start at (0, 0) at the bottom left corner of the first pixel. The coordinates increase by 1 for each pixel traveled right or up.
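
As a sketch of how such a coordinate could be turned into array indices: the function below maps (u, v) to a position in the luma matrix and in the half-resolution chroma matrix. It assumes that row 0 of each row-major matrix is the bottom row, so that the "first pixel" sits at the bottom left, and that each chroma sample covers a 2x2 block of luma pixels. Both are my reading of the description, not something confirmed here, and `sampleIndices` is a hypothetical name.

```swift
// Sketch: map a local texture coordinate to indices into the luma and
// chroma arrays. Assumptions: row 0 is the bottom row of the tile, and
// each chroma sample covers a 2x2 block of luma pixels.

func sampleIndices(u: Float, v: Float, lumaSide: Int) -> (luma: Int, chroma: Int)? {
  let limit = Float(lumaSide)
  // Reject coordinates outside the tile (NaN also fails these comparisons).
  guard u >= 0, v >= 0, u <= limit, v <= limit else { return nil }

  // A coordinate exactly on the far edge still belongs to the last pixel.
  let column = min(Int(u), lumaSide - 1)
  let row = min(Int(v), lumaSide - 1)
  let chromaSide = lumaSide / 2
  return (luma: row * lumaSide + column,
          chroma: (row / 2) * chromaSide + column / 2)
}
```

For example, in a 12x12 tile the coordinate (0.5, 0.5) lands in the first pixel under these assumptions, and (11.5, 0.5) lands in the last pixel of the same row. If exported images come out vertically flipped, the row-order assumption is the one to revisit.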

This should ZIP compress quite nicely, as over 50% of the pixels are zeroed out. Per 100K triangles, you can expect 50 MB uncompressed and 25 MB compressed.
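
The uncompressed size follows directly from the record sizes in the format. A small sketch of the arithmetic, with `exportedByteCount` as a hypothetical helper name:

```swift
// Sketch: uncompressed size implied by the serialization format.
let headerSize = 64
let smallRecordSize = 320
let largeRecordSize = 1280

func exportedByteCount(smallTriangles: Int, largeTriangles: Int) -> Int {
  headerSize + smallRecordSize * smallTriangles + largeRecordSize * largeTriangles
}

// 100K small triangles only: about 32 MB.
let allSmall = exportedByteCount(smallTriangles: 100_000, largeTriangles: 0)
// 100K large triangles only: about 128 MB.
let allLarge = exportedByteCount(smallTriangles: 100_000, largeTriangles: 100_000 - 100_000)
  + largeRecordSize * 100_000
// A mix of 81,250 small and 18,750 large: about 50 MB.
let mixed = exportedByteCount(smallTriangles: 81_250, largeTriangles: 18_750)
```

The mix in the last line is back-solved from the 50 MB figure rather than measured: taken literally, that estimate corresponds to roughly one large triangle in five. Real scenes will vary.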

philipturner avatar Jul 01 '23 23:07 philipturner

#92 shows how to use the exporter API.

This is ready for testing inside an ARHK tutorial. I won’t proactively test it, but I’ll fix any bugs you report.

philipturner avatar Jul 01 '23 23:07 philipturner

I corrected an error in the serialization format specification. It previously listed 14x14 luma for the 320-byte tiles and 12x12 chroma for the 1280-byte tiles. Those were swapped: the 320-byte tiles have 12x12 luma and the 1280-byte tiles have 14x14 chroma.

philipturner avatar Jul 18 '23 13:07 philipturner