
Allow Usage of Intel oneDNN EP For ONNX Backend

Open · narolski opened this issue 3 years ago · 2 comments

Is your feature request related to a problem? Please describe.
I would like to use the Intel oneDNN Execution Provider (EP) in the ONNX Runtime build used by the Triton Inference Server ONNX backend.

Describe the solution you'd like
Ideally, the oneDNN EP should be enabled the same way the OpenVINO EP can be enabled in the model configuration:

optimization {
  execution_accelerators {
    cpu_execution_accelerator : [ {
      name : "openvino"
    } ]
  }
}
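
A hypothetical equivalent for oneDNN could look something like the following (the accelerator name "dnnl" here is just a guess on my part; it is not an accepted value today):

optimization {
  execution_accelerators {
    cpu_execution_accelerator : [ {
      name : "dnnl"
    } ]
  }
}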

Describe alternatives you've considered
I have tried passing dnnl as the cpu_execution_accelerator name, but it is not supported.

oneDNN might yield greater performance improvements for CPU inference than OpenVINO, which is why it would be great to be able to use it within Triton Inference Server.

Update: Furthermore, it seems that in an ONNX Runtime wheel built with the DNNL (oneDNN) EP, that EP is prioritized by default over the standard ONNX Runtime CPU Execution Provider:

"When using the python wheel from the ONNX Runtime built with DNNL execution provider, it will be automatically prioritized over the CPU execution provider. Python APIs details are here."
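
For reference, this is roughly how the DNNL EP can be selected explicitly through the ONNX Runtime Python API, assuming a wheel built with the DNNL EP ("model.onnx" is a placeholder path):

import onnxruntime as ort

# The DNNL EP only appears here if the installed wheel was built with it enabled.
print(ort.get_available_providers())

# Explicitly prioritize the DNNL EP, falling back to the default CPU EP.
session = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["DnnlExecutionProvider", "CPUExecutionProvider"],
)

The ask here is essentially for the Triton ONNX backend to expose the same choice through the model configuration.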

Additional context
ONNX Runtime oneDNN EP documentation: https://fs-eire.github.io/onnxruntime/docs/execution-providers/oneDNN-ExecutionProvider.html

narolski · Jul 27 '22 08:07

@pranavsharma Do you think it would be possible to implement this configuration option?

narolski · Sep 23 '22 07:09

We haven't planned for it yet. Would you like to contribute?

pranavsharma · Sep 23 '22 23:09