taichi [RFC][AOT] C-API for Textures

Texture support was recently added to Taichi, and the corresponding C-APIs are provided in #5520. This issue discusses the design and logistics of these interfaces. I won't dive into every aspect of the design, but the most confusing and critical parts are covered.

This issue is open for discussion, including aspects not covered by the text below. Please feel free to leave your comments here. 🤗

Explicit Dimension

The proposed interface requires texture dimensions to be specified explicitly. Zeros in image extents and array layer counts are not allowed.

// enumeration.texture_dimension
typedef enum TiTextureDimension {
  TI_TEXTURE_DIMENSION_1D = 1,
  TI_TEXTURE_DIMENSION_2D = 2,
  TI_TEXTURE_DIMENSION_3D = 3,
  TI_TEXTURE_DIMENSION_1DARRAY = 4,
  TI_TEXTURE_DIMENSION_2DARRAY = 5,
  TI_TEXTURE_DIMENSION_CUBE = 6,
  TI_TEXTURE_DIMENSION_MAX_ENUM = 0xffffffff,
} TiTextureDimension;

It might be okay to let the user specify image dimensions implicitly with zeros in image extents, but that's not compatible with cube maps. Note that although cube maps are convertible to 2D texture arrays in Vulkan and DirectX11, Metal doesn't support such usage. Also, 3D textures cannot be arrayed, so explicit typing is important.
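To make the rules above concrete, here is a minimal sketch of how explicit dimensions could be validated. `ExampleExtent` and `example_is_valid_extent` are hypothetical names for illustration only and are not part of the proposed C-API. The sketch also assumes that a cube map is described with square faces and exactly six layers, which the text above does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

// Mirrors the enumeration proposed above (the MAX_ENUM sentinel is omitted).
typedef enum TiTextureDimension {
  TI_TEXTURE_DIMENSION_1D = 1,
  TI_TEXTURE_DIMENSION_2D = 2,
  TI_TEXTURE_DIMENSION_3D = 3,
  TI_TEXTURE_DIMENSION_1DARRAY = 4,
  TI_TEXTURE_DIMENSION_2DARRAY = 5,
  TI_TEXTURE_DIMENSION_CUBE = 6,
} TiTextureDimension;

// Hypothetical extent description, for illustration only.
typedef struct ExampleExtent {
  uint32_t width;
  uint32_t height;
  uint32_t depth;
  uint32_t array_layer_count;
} ExampleExtent;

// Returns true if the extent is consistent with the explicit dimension.
static bool example_is_valid_extent(TiTextureDimension dimension,
                                    const ExampleExtent *e) {
  // Zeros are never allowed; unused axes are expected to be 1.
  if (e->width == 0 || e->height == 0 || e->depth == 0 ||
      e->array_layer_count == 0) {
    return false;
  }
  switch (dimension) {
  case TI_TEXTURE_DIMENSION_1D:
    return e->height == 1 && e->depth == 1 && e->array_layer_count == 1;
  case TI_TEXTURE_DIMENSION_2D:
    return e->depth == 1 && e->array_layer_count == 1;
  case TI_TEXTURE_DIMENSION_3D:
    // 3D textures cannot be arrayed.
    return e->array_layer_count == 1;
  case TI_TEXTURE_DIMENSION_1DARRAY:
    return e->height == 1 && e->depth == 1;
  case TI_TEXTURE_DIMENSION_2DARRAY:
    return e->depth == 1;
  case TI_TEXTURE_DIMENSION_CUBE:
    // Assumption of this sketch: square faces, exactly six layers.
    return e->width == e->height && e->depth == 1 &&
           e->array_layer_count == 6;
  default:
    return false;
  }
}
```

With explicit typing, a 64x64 extent with six layers can be told apart as either a cube map or a 2D array, which implicit zero-based inference could not do.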

Explicit Format

The texture C-API introduces TiTextureFormat for the user to specify texel (pixel) formats.

// enumeration.texture_format
typedef enum TiTextureFormat {
  TI_TEXTURE_FORMAT_R8 = 0,
  TI_TEXTURE_FORMAT_R8G8 = 1,
  TI_TEXTURE_FORMAT_R8G8B8A8 = 2,
  TI_TEXTURE_FORMAT_R10G10B10A2 = 3,
  TI_TEXTURE_FORMAT_R16 = 4,
  TI_TEXTURE_FORMAT_R16G16 = 5,
  TI_TEXTURE_FORMAT_R16G16B16A16 = 6,
  TI_TEXTURE_FORMAT_R11G11B10F = 7,
  TI_TEXTURE_FORMAT_R16F = 8,
  TI_TEXTURE_FORMAT_R16G16F = 9,
  TI_TEXTURE_FORMAT_R16G16B16A16F = 10,
  TI_TEXTURE_FORMAT_R32F = 11,
  TI_TEXTURE_FORMAT_R32G32F = 12,
  TI_TEXTURE_FORMAT_R32G32B32A32F = 13,
  TI_TEXTURE_FORMAT_MAX_ENUM = 0xffffffff,
} TiTextureFormat;

The listed formats are ubiquitously supported by modern graphics devices, from desktop GPUs to mobile ones, for both reading and writing. Restricting users to this list means they won't be caught off guard by a format being unavailable on a specific device in deployment.

It would be fine to compose texel formats simply from component_types and component_counts, but there are cases where such an API falls short. For example, R10G10B10A2 is an unsigned normalized format that has 10 bits for each of the RGB components and 2 bits for the alpha component, all packed in a 32-bit word. It is a common data type for wide-gamut color representation (like Display P3). R11G11B10F has two 11-bit small floating-point components (R and G) and a 10-bit floating-point B component, all packed in a 32-bit word. It's usually used to represent HDR content and intermediate render targets in game engines, without doubling the memory footprint as R16G16B16A16F would. I think these use cases are common enough to justify the extra complication in texture format specification.
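As an illustration of why such a format cannot be described by a single component type and count, here is a minimal sketch of packing one R10G10B10A2 texel. The function names are hypothetical, and the bit order (R in the lowest 10 bits, A in the highest 2) is an assumption that follows the packing used by DXGI_FORMAT_R10G10B10A2_UNORM; the proposal itself doesn't state a bit order.

```c
#include <stdint.h>

// Clamp a value to [0, 1] and quantize it to an integer in [0, max_value],
// rounding to nearest. NaN and negative inputs map to 0.
static uint32_t example_quantize_unorm(float value, uint32_t max_value) {
  if (!(value > 0.0f)) {
    return 0;
  }
  if (value > 1.0f) {
    value = 1.0f;
  }
  return (uint32_t)(value * (float)max_value + 0.5f);
}

// Pack four normalized components into one 32-bit word.
// Assumed layout: bits 0-9 = R, 10-19 = G, 20-29 = B, 30-31 = A.
static uint32_t example_pack_r10g10b10a2(float r, float g, float b, float a) {
  uint32_t ri = example_quantize_unorm(r, 1023u); // 10 bits
  uint32_t gi = example_quantize_unorm(g, 1023u); // 10 bits
  uint32_t bi = example_quantize_unorm(b, 1023u); // 10 bits
  uint32_t ai = example_quantize_unorm(a, 3u);    // 2 bits
  return ri | (gi << 10) | (bi << 20) | (ai << 30);
}
```

The components have different bit widths within the same texel, so a (component_type, component_count) pair alone can't express the format.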

Vulkan Interoperation

In Vulkan specifically, image objects have a dynamic property called image layout. An image layout describes how the image content is physically arranged in the bound memory, and it might change during pipeline execution. So if a VkImageLayout is queried for a texture from ti_{import|export}_vulkan_texture, the layout might already have changed by the time the users submit their own command buffer. The users also need a way to notify Taichi about any layout change of exported images.

// function.get_vulkan_texture_layout
TI_DLL_EXPORT VkImageLayout TI_API_CALL
ti_get_vulkan_texture_layout(TiRuntime runtime, TiTexture texture);

// function.set_vulkan_texture_layout
TI_DLL_EXPORT void TI_API_CALL
ti_set_vulkan_texture_layout(TiRuntime runtime,
                             TiTexture texture,
                             VkImageLayout layout);

The current solution is to provide a dedicated set of APIs (ti_{get|set}_vulkan_texture_layout), but I wonder if it would be better to allow the users to issue queries via unified interfaces like ti_{get|set}_texture_property with a string or enum property name?
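For discussion, a rough sketch of what the enum-keyed variant might look like. Everything here is hypothetical: the `Example*` types and functions are mock stand-ins and don't exist in Taichi, and the property value is carried as a plain 64-bit integer just to keep the sketch self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical property names; only one is sketched here.
typedef enum ExampleTextureProperty {
  EXAMPLE_TEXTURE_PROPERTY_VULKAN_IMAGE_LAYOUT = 0,
  EXAMPLE_TEXTURE_PROPERTY_COUNT
} ExampleTextureProperty;

// Mock texture object that stores one value per property.
typedef struct ExampleTexture {
  uint64_t values[EXAMPLE_TEXTURE_PROPERTY_COUNT];
  bool is_set[EXAMPLE_TEXTURE_PROPERTY_COUNT];
} ExampleTexture;

// Stores a property value; returns false for unknown property names.
static bool example_set_texture_property(ExampleTexture *texture,
                                         ExampleTextureProperty property,
                                         uint64_t value) {
  if ((int)property < 0 || property >= EXAMPLE_TEXTURE_PROPERTY_COUNT) {
    return false;
  }
  texture->values[property] = value;
  texture->is_set[property] = true;
  return true;
}

// Reads a property value; returns false if it is unknown or was never set.
static bool example_get_texture_property(const ExampleTexture *texture,
                                         ExampleTextureProperty property,
                                         uint64_t *out_value) {
  if ((int)property < 0 || property >= EXAMPLE_TEXTURE_PROPERTY_COUNT ||
      !texture->is_set[property]) {
    return false;
  }
  *out_value = texture->values[property];
  return true;
}
```

The trade-off is that a unified interface keeps the exported symbol list small, but the value loses its type (VkImageLayout here becomes an integer), so misuse is only caught at run time.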

Jul 27 '22 06:07 PENGUINLIONG

I wonder if it would be better to allow the users to issue queries via unified interfaces like ti_{get|set}_texture_property with a string or enum property name?

From the RHI perspective, this does seem like a more unified way? Thoughts @bobcao3 ?

Jul 28 '22 01:07 k-ye

For layouts, the user needs to assume all images coming in are in the "undefined" layout, and an appropriate layout transition should be done by the user. There is no way to query or set the layout because it's a GPU state that depends on the execution of command buffers. A layout is not an attribute permanently assigned to an image; an image can and will need to change layout when the user wants to use it.

Short answer: no, we shouldn't provide ti_{get|set}_vulkan_texture_layout.

Jul 30 '22 22:07 bobcao3

Also for the enums, please follow what's already there in the device API (device.h). It's very well defined already, and I would suggest adding the entire device API to the C-API.

Jul 30 '22 22:07 bobcao3

Thx for the comments! :)

For layouts, the user needs to assume all images coming in are in the "undefined" layout, and an appropriate layout transition should be done by the user. There is no way to query or set the layout because it's a GPU state that depends on the execution of command buffers. A layout is not an attribute permanently assigned to an image; an image can and will need to change layout when the user wants to use it.

Short answer: no, we shouldn't provide ti_{get|set}_vulkan_texture_layout.

I see. So to my understanding, no assumption shall be made about the existing data of imported images? Alternatively, I would add some checks to ensure that there is no read access to those imported textures, so that users are aware of their write-only nature.

But if importing readonly textures is not allowed, and DX11 doesn't support buffer-texture copy, I'm considering whether we should offer interfaces like ti_allocate_texture_with_initial_data.

Also for the enums, please follow what's already there in the device API (device.h). It's very well defined already, and I would suggest adding the entire device API to the C-API.

For this one I can extend those enums and move them to the incs.

Jul 31 '22 03:07 PENGUINLIONG

I just realized that our implementation doesn't track image layouts yet, probably for the same reasons @bobcao3 has discussed.

https://github.com/taichi-dev/taichi/blob/fc3af55e56a17c9e9b770cefd33e7389c44ac0aa/taichi/runtime/gfx/runtime.cpp#L498-L509

According to the specification, transitioning an image from the undefined layout might invalidate its existing content.

VK_IMAGE_LAYOUT_UNDEFINED specifies that the layout is unknown. Image memory cannot be transitioned into this layout. This layout can be used as the initialLayout member of VkImageCreateInfo. This layout can be used in place of the current image layout in a layout transition, but doing so will cause the contents of the image’s memory to be undefined.

At least to be conformant with the specification, we would have to track the image layout state if the image is part of the computation rather than a mere read-only resource (whose layout never changes). Taichi follows a linear execution model in which an earlier launch always precedes a later one, so if the image lives only within the Taichi runtime, there should be no problem for us to track it.
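A minimal sketch of the bookkeeping this would imply, assuming launches are recorded strictly in order. The `Example*` names are hypothetical, the layout enum is a mock stand-in for VkImageLayout, and this is not how the Taichi runtime is currently implemented.

```c
#include <stdbool.h>

// Mock stand-in for a few VkImageLayout values.
typedef enum ExampleImageLayout {
  EXAMPLE_IMAGE_LAYOUT_UNDEFINED = 0,
  EXAMPLE_IMAGE_LAYOUT_GENERAL,
  EXAMPLE_IMAGE_LAYOUT_SHADER_READ_ONLY
} ExampleImageLayout;

// Per-image state recorded by the runtime in launch order.
typedef struct ExampleTrackedImage {
  ExampleImageLayout layout;
  bool content_valid;
} ExampleTrackedImage;

// Records a layout transition. Returns false if the transition is illegal.
static bool example_record_transition(ExampleTrackedImage *image,
                                      ExampleImageLayout new_layout) {
  // The specification forbids transitioning into the undefined layout.
  if (new_layout == EXAMPLE_IMAGE_LAYOUT_UNDEFINED) {
    return false;
  }
  // Transitioning out of the undefined layout leaves the content undefined.
  if (image->layout == EXAMPLE_IMAGE_LAYOUT_UNDEFINED) {
    image->content_valid = false;
  }
  image->layout = new_layout;
  return true;
}

// Records that a kernel has written the whole image.
static void example_record_write(ExampleTrackedImage *image) {
  image->content_valid = true;
}
```

With the real layout known at each launch, a transition can name the actual old layout instead of the undefined one, so the content written by an earlier kernel survives into the next launch.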

That brings us back to the same problem: how should the image layout be handled if an external procedure has already loaded or generated content in textures for Taichi kernels to consume, or the opposite? But before further discussion, let me try it out with Unity as an example.

Aug 01 '22 06:08 PENGUINLIONG