WIP cl_ext_image_tiling_control

Open kpet opened this issue 4 years ago • 9 comments

Open topics:

  • [ ] Interaction with YUV images as defined by https://github.com/KhronosGroup/OpenCL-Docs/pull/722
  • [ ] Interaction with images created from a buffer
    • [ ] Support tilings beyond linear
  • [ ] Interaction with external memory (role of row_pitch and slice_pitch)
    • [ ] What tiling to return when linear tiling is not assumed?
    • [ ] Interaction with layout inference extensions
  • [ ] Ability to create images with non-linear tiling and CL_MEM_USE_HOST_PTR (see https://github.com/KhronosGroup/OpenCL-Docs/pull/710#discussion_r990563761)
  • [ ] Ability to map the image without de-tiling when the image uses a non-linear tiling (Qcom + https://github.com/KhronosGroup/OpenCL-Docs/pull/710#discussion_r990565573)
  • [ ] DEFAULT implementation-defined tiling (see https://github.com/KhronosGroup/OpenCL-Docs/pull/710#discussion_r990565267 and https://github.com/KhronosGroup/OpenCL-Docs/pull/710#discussion_r1060647302)
  • [ ] Many tiling schemes use fixed-size blocks. Introduce the concept of block to the specification and define
    • Rules for mapping/unmapping, including overlap rules. Can a region that starts within a block be mapped?
    • Do we need queries to report per-format and tiling scheme block properties (sizes, etc)
    • Probably more
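
To make the block question concrete, here is a minimal sketch in C of one possible answer: a region that starts inside a block is widened to the enclosing block-aligned region before mapping. This is only an illustration of the arithmetic. The `span` type, the function names, and the single-axis texel units are assumptions, and nothing in the draft specification mandates this behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-axis region in texels: [origin, origin + extent). */
typedef struct {
    size_t origin;
    size_t extent;
} span;

/* True if the region starts exactly on a block boundary. */
static bool starts_on_block_boundary(size_t origin, size_t block)
{
    assert(block > 0);
    return origin % block == 0;
}

/* Widens [origin, origin + extent) to the smallest enclosing region whose
 * start and end both fall on multiples of the block size. */
static span align_to_blocks(size_t origin, size_t extent, size_t block)
{
    assert(block > 0);
    size_t start = (origin / block) * block;                      /* round down */
    size_t end = ((origin + extent + block - 1) / block) * block; /* round up */
    span out = { start, end - start };
    return out;
}
```

With 4-texel blocks, a request for texels 5 through 10 would be widened to texels 4 through 11. Whether an implementation should widen, reject, or accept such a request is exactly what the open topic asks.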

Change-Id: I6c391b5ecaa203f7db566db68a7e3d124d6038a2 Signed-off-by: Kevin Petit [email protected]

kpet avatar Nov 10 '21 17:11 kpet

Discussed in the October 4th teleconference:

  • Do we need to define and document any interactions with YUV extensions (#722)?
  • Are there any interactions to consider when creating an image from a buffer, or from external memory?
  • It would be helpful to clearly describe the parts of this extension that provide additional functional guarantees beyond "performance hints".

bashbaug avatar Oct 08 '22 00:10 bashbaug

Some initial thoughts

  1. It should be possible for (non-linear) tiled images to be created with CL_MEM_USE_HOST_PTR. I believe the spec already allows for this implicitly.
  2. It would be helpful to allow tiled images to be created with a flag that permits host access but does not detile the image for host access. Therefore EnqueueMapImage would return a host pointer to the tiled image data. This could be useful in cases where the CPU has to transfer image data between the GPU and another IP block or the GPU and storage.
  3. As mentioned earlier we should clarify the interaction of this extension (specifically tiled images) with ext_image_from_buffer and external_memory_import with attention given to the role of row_pitch and slice_pitch.

Thanks, Balaji Calidas Qualcomm

bcalidas avatar Oct 08 '22 00:10 bcalidas

I wanted to leave some high-level thoughts about how this extension could work.

The major impact of linear vs optimal tiling modes would be when the image is created from a buffer or from external memory. In these cases, if the image has optimal tiling mode, then the application should be required to set row_pitch and/or slice_pitch to zero.
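
As a rough sketch of the pitch rule suggested above, the check below models how an implementation might validate the pitches when an image is created from a buffer or external memory. The enum and function names are hypothetical stand-ins, not taken from any published header. The sketch also reads "and/or" as "both pitches must be zero", which is one interpretation of the comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tiling modes discussed in this thread. */
typedef enum {
    TILING_LINEAR,
    TILING_OPTIMAL
} tiling_mode;

/* Returns true if the pitches are acceptable under the suggested rule.
 * Optimal tiling has an implementation-defined layout, so the application
 * cannot meaningfully describe it with pitches and must pass zero.
 * Linear tiling may carry explicit pitches in this sketch. */
static bool pitches_valid_for_tiling(tiling_mode tiling,
                                     size_t row_pitch,
                                     size_t slice_pitch)
{
    if (tiling == TILING_OPTIMAL) {
        return row_pitch == 0 && slice_pitch == 0;
    }
    return true;
}
```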

Applications should be able to choose whether an optimally tiled image should present a linear view when mapped for host CPU access. Currently, implementations are implicitly required to present a linear view of image data when mapped for host access. This extension could possibly give implementations the ability to opt out of providing that linear view. How the image data is finally presented when mapped for host access would then depend on both implementation capabilities and application selection.

Ben has suggested that applications make the linear view vs. raw view selection as part of the EnqueueMap command. This could also be specified when creating the image. We'll need to discuss these options further.

Thanks, Balaji

bcalidas avatar Oct 16 '22 04:10 bcalidas

I wanted to leave some high-level thoughts about how this extension could work....

I think the comment above describes one way this extension could work, but not necessarily the only way.

IMHO the key value this extension provides is a hint (or assertion?) to the driver that the image should (or must?) be stored internally in a specific format - either "linear" (well-specified) or "tiled" (implementation-defined, in this extension, at least). This will primarily change the performance characteristics of an image, especially when it is mapped, though likely also when it is accessed within a kernel.

Once we have the ability to control or otherwise influence the internal layout it gives us the ability to add additional functionality, either in this extension or in other layered extensions. For example, we could:

  1. Influence or mandate an internal layout when creating an image from a buffer or from an external memory handle.
  2. Map the image but provide direct access to image data in the internal layout vs. converting to a linear layout.

One of the key questions I think we'll need to answer is whether this extension is purely a performance hint or whether it is something stronger. For the base-level functionality I think we could go either way, but for the additional functionality (external memory and direct access mapping) I think we'd want stronger guarantees (at least in some cases).

bashbaug avatar Oct 17 '22 18:10 bashbaug

I posted some comments on https://github.com/KhronosGroup/OpenCL-Docs/issues/861 which include updates to cl_ext_image_tiling_control. These comments also discuss how cl_ext_image_tiling_control could bring clarity to external memory and image_from_buffer usage.

Thanks, Balaji Calidas Qualcomm

bcalidas avatar Jan 09 '23 05:01 bcalidas

Following up on the discussion around #861 from Jan 10, 2023, we'd like to propose the following outline for this extension.

First, we recommend that this extension be made khr, since it is foundational to other extensions, including external memory. Image_tiling_control defines 2 new properties for image tiling:

CL_IMAGE_TILING_LINEAR_KHR and CL_IMAGE_TILING_OPTIMAL_KHR

These properties can be used when calling clCreateImageWithProperties and bring clarity to the image layout when used.

Optimal tiling can be thought of as a superset of linear tiling. What it does is let implementations decide how the image should be tiled. An implementation can continue to use linear tiling even if the application selects CL_IMAGE_TILING_OPTIMAL_KHR. With that in mind, we believe all implementations should be able to support both linear and optimal tiling.

We also recommend that all implementations be required to support device access to optimally tiled and linear images. It seems that device access is a minimum for the image to be useful. Regarding host access, we propose that the default behavior be to present a linear view of the image data when clEnqueueMapImage is called. This means that implementations will need to detile optimally tiled images (if needed) when clEnqueueMapImage is called. If this is problematic for some vendors, we can add a cl_device_info param that lets the application know whether the implementation can detile optimally tiled images when clEnqueueMapImage is called.

For many use cases having direct access to tiled image data is useful. For this reason, we are proposing a new flag CL_MAP_IMAGE_NO_DETILE_KHR. When used in clEnqueueMapImage, the implementation will not detile the data for host access.
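
The proposed mapping behavior can be sketched as a small decision function. Everything here is illustrative: the bit chosen for the no-detile flag is an arbitrary placeholder (this thread does not assign a value to CL_MAP_IMAGE_NO_DETILE_KHR), and the helper names are hypothetical. Only the value of the read flag mirrors the existing CL_MAP_READ definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the value of CL_MAP_READ from the OpenCL headers. */
#define SKETCH_MAP_READ (UINT64_C(1) << 0)

/* Placeholder bit for the proposed CL_MAP_IMAGE_NO_DETILE_KHR flag.
 * The real value, if the flag is adopted, is not defined in this thread. */
#define SKETCH_MAP_IMAGE_NO_DETILE (UINT64_C(1) << 32)

typedef enum { VIEW_LINEAR, VIEW_RAW } host_view;

/* Default is a linear view; the no-detile flag requests the raw layout. */
static host_view host_view_for_map(uint64_t map_flags)
{
    return (map_flags & SKETCH_MAP_IMAGE_NO_DETILE) ? VIEW_RAW : VIEW_LINEAR;
}

/* Detiling is only needed when the implementation actually stored the image
 * in a non-linear layout and the host asked for the default linear view. */
static bool map_needs_detile(bool internal_layout_is_linear, uint64_t map_flags)
{
    if (internal_layout_is_linear) {
        return false;
    }
    return host_view_for_map(map_flags) == VIEW_LINEAR;
}
```

Because an implementation may keep a linear layout even when optimal tiling is requested, the sketch keys the detile decision on the actual internal layout rather than on the requested tiling mode.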

Thanks, Balaji Calidas Qualcomm

bcalidas avatar Jan 15 '23 01:01 bcalidas

I've rebased the specification draft and captured what I believe to be an exhaustive list of all the open topics in the PR description. We seem to be mostly aligned on where this extension should go.

kpet avatar Jan 17 '23 15:01 kpet

We uploaded an updated draft that has the following major modifications.

  1. Implementations supporting this extension will support both linear and optimal image layouts. Optimal layouts can be mapped to linear by an implementation.
  2. Allows per-image queries of the possible tiling modes, the actual tiling mode, and the behavior of EnqueueMapImage.
  3. By default, EnqueueMapImage will present a linear view of the image data. However, for some images (likely extension images as opposed to core image formats), a linear view will not be guaranteed when mapping the image. Instead, the raw image data will be presented.
  4. Applying this extension to images that are created without explicitly specifying the tiling:
    a) Most images will be considered to have optimal tiling. This is the default. When these images are mapped, a linear view of the image data will be presented.
    b) When an image is created from a buffer, the image is considered to have linear tiling if a row pitch and/or slice pitch are specified.
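
The rules in item 4 for images created without an explicit tiling could be sketched as follows. The type and function names are hypothetical, and the sketch assumes that a non-zero pitch is what "specified" means.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tiling modes discussed in this thread. */
typedef enum {
    TILING_LINEAR,
    TILING_OPTIMAL
} tiling_mode;

/* Tiling assumed for an image created without an explicit tiling property.
 * An image created from a buffer with a row pitch and/or slice pitch is
 * treated as linear; every other image defaults to optimal. */
static tiling_mode infer_default_tiling(bool created_from_buffer,
                                        size_t row_pitch,
                                        size_t slice_pitch)
{
    if (created_from_buffer && (row_pitch != 0 || slice_pitch != 0)) {
        return TILING_LINEAR;
    }
    return TILING_OPTIMAL;
}
```

Under this reading, an image created from a buffer with both pitches left at zero still falls into the default optimal case.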

Thanks, Balaji

bcalidas avatar Apr 01 '23 01:04 bcalidas