Adding ONNX Runtime C-API for WebGPU EP

Open fs-eire opened this issue 1 year ago • 4 comments

Description

This PR introduces the C API changes for the WebGPU EP.

It is opened for review purposes only. The build may fail because the interface is not implemented yet.

Use scenarios

There are 2 scenarios:

  • Users use the "default" device for WebGPU. In this case they do not need to pass in any pointers. The WebGPU EP maintains a map internally and manages the default device itself.
  • Users want to use a custom device. They are responsible for creating the device using the WebGPU API (this is from the WebGPU public header) and for managing the lifecycle of the objects. Users must make sure the device stays available for the whole lifetime of the InferenceSession instance.

Design

This PR introduces a device ID concept (different from the device ID inside MemoryInfo):

  • Device ID = 0 means the default context. The user does not need to pass any value.
  • Device ID > 0 means a custom context, and the ID is used as a unique key to retrieve the cached context.
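
The device ID scheme above can be sketched roughly as follows. This is only an illustration: `ContextRegistry`, `FakeContext` and `GetOrCreate` are hypothetical names that do not come from this PR or from ONNX Runtime, and the error handling (throwing on a missing handle) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the per-device state the EP would cache.
struct FakeContext {
  int device_id;
  const void* device;  // opaque user handle; null for the default context
};

// Hypothetical registry that maps a device ID to a cached context.
class ContextRegistry {
 public:
  // ID 0: default context, created lazily; any handle passed in is ignored.
  // ID > 0: custom context; the first call must supply a non-null handle,
  // later calls with the same ID return the cached context.
  FakeContext& GetOrCreate(int device_id, const void* device = nullptr) {
    if (device_id < 0) {
      throw std::invalid_argument("device ID must be >= 0");
    }
    auto it = contexts_.find(device_id);
    if (it != contexts_.end()) {
      return *it->second;
    }
    if (device_id > 0 && device == nullptr) {
      throw std::invalid_argument("custom device ID requires a device handle");
    }
    auto ctx = std::make_unique<FakeContext>(
        FakeContext{device_id, device_id == 0 ? nullptr : device});
    auto result = contexts_.emplace(device_id, std::move(ctx));
    assert(result.second);  // find() above guarantees the key was absent
    return *result.first->second;
  }

  std::size_t size() const { return contexts_.size(); }

 private:
  std::unordered_map<int, std::unique_ptr<FakeContext>> contexts_;
};
```

In this sketch, `GetOrCreate(0)` never needs a handle, while `GetOrCreate(1, handle)` registers a custom context once and later lookups with ID 1 return the same object.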

There is a concern about passing pointers through strings: integer serialization and deserialization are hard to keep consistent unless both sides use the same set of APIs, which makes this approach error-prone. It is also inefficient. So the WebGPU EP options consist of 2 parts:

  • struct OrtWebGPUProviderOptions: contains the device description, including the device ID and the handles of WebGPUInstance, WebGPUAdapter and WebGPUDevice. All fields are zero for the default context.
  • string-based key-value pairs: all other extra information, including:
    Key                            Possible Values                                Default Value
    "preferredLayout"              "NHWC" or "NCHW"                               "NHWC"
    "enableGraphCapture"           "1" or "0"                                     "0"
    "storageBufferCacheMode"       "disabled", "lazyRelease", "simple", "bucket"  "bucket"
    "uniformBufferCacheMode"       "disabled", "lazyRelease", "simple", "bucket"  "lazyRelease"
    "queryResolveBufferCacheMode"  "disabled", "lazyRelease", "simple", "bucket"  "disabled"
    "defaultBufferCacheMode"       "disabled", "lazyRelease", "simple", "bucket"  "disabled"
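
As a rough illustration of how the string options listed above could be resolved against their defaults, here is a minimal sketch. `OptionSpec`, `OptionSpecs` and `ResolveOptions` are hypothetical names, not part of this PR; the keys, allowed values and defaults are taken from the table, while rejecting unknown keys or values by throwing is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical description of one option: allowed values plus default.
struct OptionSpec {
  std::set<std::string> allowed;
  std::string default_value;
};

// Keys, allowed values and defaults copied from the table in this PR.
inline const std::map<std::string, OptionSpec>& OptionSpecs() {
  using Values = std::set<std::string>;
  static const Values modes{"disabled", "lazyRelease", "simple", "bucket"};
  static const std::map<std::string, OptionSpec> specs{
      {"preferredLayout", OptionSpec{Values{"NHWC", "NCHW"}, "NHWC"}},
      {"enableGraphCapture", OptionSpec{Values{"1", "0"}, "0"}},
      {"storageBufferCacheMode", OptionSpec{modes, "bucket"}},
      {"uniformBufferCacheMode", OptionSpec{modes, "lazyRelease"}},
      {"queryResolveBufferCacheMode", OptionSpec{modes, "disabled"}},
      {"defaultBufferCacheMode", OptionSpec{modes, "disabled"}},
  };
  return specs;
}

// Starts from the defaults, then applies the user-supplied pairs.
// Throws std::invalid_argument on an unknown key or a disallowed value.
std::map<std::string, std::string> ResolveOptions(const char* const* keys,
                                                  const char* const* values,
                                                  std::size_t num_keys) {
  assert(num_keys == 0 || (keys != nullptr && values != nullptr));
  std::map<std::string, std::string> resolved;
  for (const auto& entry : OptionSpecs()) {
    resolved[entry.first] = entry.second.default_value;
  }
  for (std::size_t i = 0; i < num_keys; ++i) {
    auto it = OptionSpecs().find(keys[i]);
    if (it == OptionSpecs().end()) {
      throw std::invalid_argument(std::string("unknown option: ") + keys[i]);
    }
    if (it->second.allowed.count(values[i]) == 0) {
      throw std::invalid_argument(std::string("invalid value for ") + keys[i]);
    }
    resolved[keys[i]] = values[i];
  }
  return resolved;
}
```

Because new modes only add entries to the allowed-value sets, a scheme like this can grow without touching any struct layout.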

Consideration

  • The struct OrtWebGPUProviderOptions may not be modified once released, for ABI compatibility reasons. So keep the number of fields in it to a minimum and use the string-based key-value pairs as much as possible. For example, the set of buffer cache modes may be extended in the future, and that would not require modifying the API.
  • We may want to support creating the WebGPU EP via SessionOptionsAppendExecutionProvider() for the default device as well (all options are string based in this use case).

Example:

Use default device:

  ...

  OrtWebGPUProviderOptions webgpu_options{};

  ...

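  // A C API cannot accept std::string*, so hold the options as C strings
  // (assuming the function takes const char* const* arrays).
  std::vector<const char*> keys { "storageBufferCacheMode" };
  std::vector<const char*> values { "simple" };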

  auto status = SessionOptionsAppendExecutionProvider_WebGPU(
                  &session_options,
                  &webgpu_options,
                  keys.data(),
                  values.data(),
                  1);

Use custom device:

  ...

  WGPUInstance instance;
  WGPUAdapter adapter;
  WGPUDevice device;
  CreateWebGpuHandles(..., &instance, &adapter, &device);

  OrtWebGPUProviderOptions webgpu_options{
    1,  //device ID
    instance,
    adapter,
    device
  };

  ...

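  // A C API cannot accept std::string*, so hold the options as C strings
  // (assuming the function takes const char* const* arrays).
  std::vector<const char*> keys { "storageBufferCacheMode" };
  std::vector<const char*> values { "simple" };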

  auto status = SessionOptionsAppendExecutionProvider_WebGPU(
                  &session_options,
                  &webgpu_options,
                  keys.data(),
                  values.data(),
                  1);
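
When the option strings already live in `std::string` objects, a C function cannot take `std::string*` directly. A minimal sketch of building the pointer array first is shown here, assuming the key and value parameters are `const char* const*` as in other ONNX Runtime C API entry points (the exact signature is not shown in this PR). `ToCStrings` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds an array of C-string pointers that view the given strings.
// The pointers are valid only while `strings` is alive and unmodified.
std::vector<const char*> ToCStrings(const std::vector<std::string>& strings) {
  std::vector<const char*> out;
  out.reserve(strings.size());
  for (const auto& s : strings) {
    out.push_back(s.c_str());
  }
  assert(out.size() == strings.size());
  return out;
}
```

The result's `data()` and `size()` can then be passed where the key and value arrays and their count are expected.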

fs-eire • Aug 23 '24 07:08

went quickly over it - lgtm in general and will go over it in more detail in a little. CI is unhappy.

guschmue • Aug 23 '24 23:08

> went quickly over it - lgtm in general and will go over it in more detail in a little. CI is unhappy.

Thank you for the review. I am waiting for an answer on whether and how we should pass the proc table information before finalizing the interface.

The build is broken, yes, because the implementation is not included in the build. I will add a stub implementation.

fs-eire • Aug 24 '24 03:08

We need to pass this table: https://source.chromium.org/chromium/chromium/src/+/main:out/webview-Debug/gen/third_party/dawn/include/dawn/dawn_proc_table.h;drc=c76cca217f4278f5c53a8d90f7870270ee4dd81e;l=26

Knowing about DawnProcTable in onnxruntime is not practical, so maybe it would need to be a void* that gets cast.

Not great. We could #ifdef that field because we need it in only one specific scenario.

guschmue • Aug 26 '24 15:08

Updated with proc table.

> We could #ifdef that field because we need it in only one specific scenario.

It may not be a good idea to make the struct size depend on a macro, considering it is part of the API. But to make it easier for users, I put the proc table member as the last member of the struct. They can still do

OrtWebGPUProviderOptions webgpu_options{};

or

OrtWebGPUProviderOptions webgpu_options{
  1,  // device ID
  instance,
  adapter,
  device
};
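
Why the shorter initializers keep working can be checked with a small sketch. `FakeWebGPUOptions`, its field names and `FakeProcTable` are hypothetical stand-ins (the real struct layout and DawnProcTable are not reproduced here), and the handles are modelled as `void*`. In C++ aggregate initialization, members without an initializer are value-initialized, so a trailing pointer member ends up null.

```cpp
#include <cassert>

// Hypothetical mirror of the options struct discussed in this thread.
struct FakeWebGPUOptions {
  int device_id;
  void* instance;
  void* adapter;
  void* device;
  const void* proc_table;  // last member, so shorter initializers still compile
};

// Hypothetical stand-in for a proc table: a struct of function pointers.
struct FakeProcTable {
  int (*get_answer)();
};

inline int FortyTwo() { return 42; }

// Casts the opaque pointer back to the table type and calls through it.
// Returns -1 when no table was supplied.
int CallThroughProcTable(const FakeWebGPUOptions& options) {
  if (options.proc_table == nullptr) {
    return -1;
  }
  const auto* table = static_cast<const FakeProcTable*>(options.proc_table);
  assert(table->get_answer != nullptr);
  return table->get_answer();
}
```

With this layout, `FakeWebGPUOptions opts{1, instance, adapter, device}` leaves `proc_table` null, and only callers that need the table pass a fifth initializer.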

fs-eire • Aug 28 '24 04:08

updated according to comments. PTAL @gyagp @guschmue

do you think there is a better name for this device_id?

fs-eire • Aug 29 '24 21:08