[Mobile] Supporting more operators for NNAPI and CoreML
Describe the issue
First of all, thanks everyone at ORT for providing such an amazingly valuable piece of software! We heavily rely on ORT for our ML applications.
Currently we have 4 ML models in production on our mobile apps (both Android and iOS) using ORT. However, none of them is fully supported by the NNAPI and CoreML EPs: each model has at least one unsupported operator that partitions the graph and forces a fallback to the CPU EP. It would be great if more operators could be supported so that we can get faster inference from these more optimal EPs. Specifically, these are the operators we're missing:
NNAPI:
- Erf
- ReduceMin
- Round
- Shape
- NonMaxSuppression
- ReduceMax
- Resize for 1CHW (right now only 2D is supported)
- ReduceSum
- ArgMax
- DynamicQuantizeLinear
- LayerNormalization
- MatMulInteger
CoreML:
- ReduceMin
- Round
- Unsqueeze
- NonMaxSuppression
- Erf
- ReduceMax
- ReduceMean
- Exp
- Max
- DequantizeLinear
- DynamicQuantizeLinear
- LayerNormalization
- MatMulInteger
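To illustrate the partitioning problem described above: any op type an EP doesn't cover splits the graph, and the surrounding nodes fall back to the CPU EP. A minimal stdlib-only sketch of that check; note the supported-op sets below are illustrative placeholders, not the NNAPI/CoreML EPs' actual coverage (which varies by ORT version):

```python
# Sketch: find which op types in a model an EP cannot take.
# ASSUMED_SUPPORTED is a hypothetical placeholder, NOT the real
# NNAPI/CoreML EP operator coverage.
ASSUMED_SUPPORTED = {
    "NNAPI": {"Conv", "Relu", "Add", "MatMul", "Softmax", "Resize"},
    "CoreML": {"Conv", "Relu", "Add", "MatMul", "Softmax", "Squeeze"},
}

def unsupported_ops(model_op_types, ep):
    """Return the model's op types not covered by the given EP.

    Each such node partitions the graph, forcing the nodes around
    it onto the CPU EP.
    """
    return sorted(set(model_op_types) - ASSUMED_SUPPORTED[ep])

# Example: op types from a detection-style model.
ops = ["Conv", "Relu", "Erf", "ReduceMin", "Round", "NonMaxSuppression"]
print(unsupported_ops(ops, "NNAPI"))
# -> ['Erf', 'NonMaxSuppression', 'ReduceMin', 'Round']
```

In practice the same information can be read from ORT's verbose logs, which report how many partitions each EP was assigned.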
I realize this is quite a long list, so to narrow it down a bit, here are the operators belonging to our two most production-critical models, roughly ordered from most to least frequent:
NNAPI:
- Erf
- ReduceMin
- Round
- Shape
- ReduceMax
- Resize for 1CHW (right now only 2D is supported)
- NonMaxSuppression
CoreML:
- Erf
- ReduceMin
- Round
- Unsqueeze
- ReduceMax
- ReduceMean
- NonMaxSuppression
It would be amazing if more operators could be added, thank you for the consideration!
Secondly, I have a small related question; hopefully it's okay to ask here. Are dynamic input shapes supported on these two EPs? This page states that they're not, but this response states that support has already been added. Thanks a bunch for your clarification! :)
To reproduce
Try to use any ONNX model containing the above-mentioned operators on NNAPI or CoreML.
Urgency
Not urgent, though it would be highly valuable for us.
Platform
Android
OS Version
Any Android or iOS OS
ONNX Runtime Installation
Released Package
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
onnxruntime-mobile
ONNX Runtime Version or Commit ID
1.19
ONNX Runtime API
Java/Kotlin
Architecture
ARM64
Execution Provider
NNAPI
Execution Provider Library Version
No response
Are dynamic input shapes supported on these two EPs? Because this page states that they're not, but https://github.com/microsoft/onnxruntime/issues/17654#issuecomment-1730371866 states that support has already been added.
The CoreML EP has support for dynamic shapes now. The NNAPI EP does not. That documentation page should be updated. Thanks for pointing it out.
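For context on what "dynamic" means here: an input shape is dynamic when any of its dimensions is symbolic (e.g. a free batch dimension) rather than a concrete integer. A tiny illustrative helper; the function name is hypothetical, not an ORT API:

```python
def has_dynamic_dims(shape):
    """True if any dimension is symbolic (a string like 'batch') or None
    rather than a concrete int."""
    return any(not isinstance(d, int) for d in shape)

# An NCHW input with a free batch dimension is dynamic, so it would
# need to be made fixed before an EP without dynamic-shape support
# (such as NNAPI) can take the graph.
print(has_dynamic_dims(["batch", 3, 224, 224]))  # -> True
print(has_dynamic_dims([1, 3, 224, 224]))        # -> False
```

If I recall correctly, ORT ships a tool for fixing such dimensions offline (`python -m onnxruntime.tools.make_dynamic_shape_fixed`), which is the usual workaround for EPs without dynamic-shape support.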
Thanks for the clarification! So CoreML and CPU have support, only NNAPI does not. Is it planned for NNAPI at some point?
Also, is there any kind of roadmap/planning for adding more operator support on the mobile EPs? I read in another issue that operator support is added based on user demand, so is there any place I can track support of (above mentioned) mobile operators?
Is it planned for NNAPI at some point?
Unfortunately, no. NNAPI will be deprecated (see here). This limits our future plans for this EP.
Also, is there any kind of roadmap/planning for adding more operator support on the mobile EPs? I read in another issue that operator support is added based on user demand, so is there any place I can track support of (above mentioned) mobile operators?
For the ops you've mentioned, we can track support with this GitHub issue.
Big thanks to the devs for the contribution to the vibrant and thriving ML open-source community ❤️
Chiming in to boost the request for ReduceMean and Unsqueeze operator support in CoreML. The practical reason is that some of insightface's ONNX-distributed models are currently incompatible with the CoreML EP due to the lack of these ops; running them on the CPU as a fallback (on both iOS and macOS) gives unsatisfying performance for video and live-camera scenarios.
Unfortunately, no. NNAPI will be deprecated (see here). This limits our future plans for this EP.
Interesting, thanks for sharing! Does this mean you'll consider supporting TFLite/LiteRT GPU instead at some point?
Some of the CoreML operators were added by https://github.com/microsoft/onnxruntime/pull/22710, https://github.com/microsoft/onnxruntime/pull/22480, and https://github.com/microsoft/onnxruntime/pull/22068.
Please reopen this issue if other ops are still needed for CoreML.