[WebNN EP] Support int64 output data type for CoreML backend

Open Honry opened this issue 1 year ago • 2 comments

Describe the feature request

The WebNN CoreML backend doesn't support the int64 data type. However, some ONNX ops produce int64 output (e.g. ArgMax, ArgMin), whereas CoreML's ArgMax produces int32 output.

That means we should check that the size of the dimension being reduced is within the int32 range, then cast the output from int32 to int64.
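The range check can be sketched as follows. This is a minimal illustration in plain Python, not the WebNN EP code, and the function name `argmax_fits_int32` is hypothetical. It relies on the fact that ArgMax/ArgMin return an index along the reduced axis, so the largest possible result is the dimension size minus one. A dynamic (unknown) dimension is modelled as `None` and treated as unsafe, which is an assumption of this sketch.

```python
INT32_MAX = 2**31 - 1


def argmax_fits_int32(shape, axis):
    """Return True if every index ArgMax/ArgMin can produce along `axis` fits in int32.

    `shape` is a list of dimension sizes; None stands for an unknown (dynamic) size.
    `axis` may be negative, as in ONNX, and is resolved by Python indexing.
    """
    dim = shape[axis]
    if dim is None:
        # Size not known ahead of time, so safety cannot be proven.
        return False
    # The largest index the op can return is dim - 1.
    return dim - 1 <= INT32_MAX
```

For example, `argmax_fits_int32([1, 1000], -1)` is true, so the int32 result could be widened to int64 without loss.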

The node for such an op must be an output of the subgraph model: its next node takes int64 input, which the CoreML backend doesn't support, so that node will fall back. The exception is a special case: ArgMax-Cast (from int64 to int32).

The following actions could be considered:

  • Use WebNN's opSupportLimits() to check whether the int64 data type is supported
  • Make sure producing int32 output instead is safe
  • Fuse ops, e.g. ArgMax-Cast (int64->int32)
  • Convert int32 output tensor back to int64
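One way to read the list above is as a decision procedure per int64-producing node. The sketch below is only an illustration: the function name, its parameters, and the returned labels are all hypothetical, and `backend_supports_int64` is assumed to have been derived elsewhere from the result of opSupportLimits().

```python
def plan_int64_output(*, backend_supports_int64, index_fits_int32,
                      is_graph_output, consumer_is_cast_to_int32):
    """Pick a strategy for a node whose ONNX output type is int64.

    All four flags are assumed to be computed by the caller; the returned
    strings are illustrative labels, not real EP constants.
    """
    if backend_supports_int64:
        return "native_int64"           # nothing special to do
    if not index_fits_int32:
        return "fallback"               # int32 could not hold the result
    if consumer_is_cast_to_int32:
        return "fuse_with_cast"         # ArgMax-Cast(int64->int32) becomes one int32 op
    if is_graph_output:
        return "emit_int32_then_widen"  # convert the output tensor back to int64
    return "fallback"                   # an int64 consumer inside the subgraph
```

The order of the checks reflects the issue text: the safety check gates both the fusion and the output conversion, and a node feeding another int64 consumer is left to fall back.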

Besides, how CoreML EP handles int64 data type would be a good reference.

Describe scenario use case

N/A

Honry avatar Jul 18 '24 02:07 Honry

how CoreML EP handles int64 data type would be a good reference

Indeed, I really wonder given all indices are int64 in ONNX.

fdwr avatar Jul 18 '24 05:07 fdwr

CoreML EP converts all int64 attribute and initializer values to int32 when creating the CoreML model (and checks for overflow errors as it does it).

It also tracks whether it needs to convert specific inputs/outputs between int64 and int32 when executing the CoreML model.

Once you have the attributes, initializers, and CoreML model inputs as int32, the internals of the CoreML model will produce int32 values, and we just need to convert the output from the CoreML model back to int64 if applicable.
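The narrowing and widening steps described here might look roughly like the sketch below. It is not the CoreML EP implementation: the function names are hypothetical, and the tensors are assumed to be raw little-endian byte buffers purely for illustration.

```python
import struct

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def narrow_int64_buffer(buf: bytes) -> bytes:
    """Convert little-endian int64 data to int32, raising on overflow."""
    count = len(buf) // 8
    values = struct.unpack(f"<{count}q", buf)
    for v in values:
        if not INT32_MIN <= v <= INT32_MAX:
            raise OverflowError(f"{v} does not fit in int32")
    return struct.pack(f"<{count}i", *values)


def widen_int32_buffer(buf: bytes) -> bytes:
    """Convert little-endian int32 data back to int64; this is always lossless."""
    count = len(buf) // 4
    return struct.pack(f"<{count}q", *struct.unpack(f"<{count}i", buf))
```

Narrowing would apply to attributes, initializers, and inputs before they reach the model, while widening would apply only to outputs that ONNX expects as int64.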

skottmckay avatar Aug 27 '24 10:08 skottmckay

Thank you @skottmckay, that's really helpful!

  • For initializer conversion, the code is here, right? It looks like most initializers are written into local weight files, and the conversion from int64 to int32 is then handled by CoreML, right?

  • For input conversion (int64->int32), the code is here, right? How does it handle overflow for int64 inputs? For the WebNN EP we need to use a cast op to convert the int64 inputs to int32, which is not safe if the data overflows.
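To illustrate why an unchecked cast is a concern, the sketch below models a cast that simply keeps the low 32 bits of each int64 value. This wraparound behaviour is an assumption made for the example; how a given backend's cast op actually treats out-of-range values is not established in this thread. The function name is hypothetical.

```python
import struct


def truncating_cast_to_int32(value: int) -> int:
    """Keep only the low 32 bits of an int64 value (two's-complement wraparound)."""
    low4 = struct.pack("<q", value)[:4]  # little-endian: the low bytes come first
    return struct.unpack("<i", low4)[0]
```

Under this assumption, in-range values survive unchanged, but `2**31` silently becomes `-2**31` and `2**32 + 5` becomes `5`, so an index or shape value could be corrupted without any error being raised.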

Honry avatar Aug 30 '24 08:08 Honry