[onnx] Build real onnx frontend cli/API
The current ONNX compilation flow is serviceable but somewhat user-hostile, requiring too much direct, low-level manipulation of the system. Recommend creating a real ONNX frontend API and CLI which handles all of the following properly. This would replace the existing front door here: https://iree.dev/guides/ml-frameworks/onnx/
- Automatic version upgrades to the current minimum supported opset version (17). There are a lot of old models out there and we have only implemented >= 17 compatibility. We shouldn't be asking users to roll their own Python upgrader script (see the sketch after this list).
- Proper separation of parameters in the default path (perhaps with an option to leave everything inlined).
- Serialize intermediates with MLIR bytecode vs text format.
- Enable MLIR reproducers and document what users should do with them (will work much better if parameter separation is default).
- Invoke the compiler pipeline by default instead of requiring the user to run import and compile as separate steps (can still have flags to do import only, etc).
- Present error messages that include the command lines for the lower-level tools, for repro/diagnostic purposes.
- Exposed as a supported Python API with CLI wrapper.
- When used as a CLI, maybe some eye-candy text UI progress, etc.
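
For context, a minimal sketch of the kind of pre-processing such a wrapper could do today, built only on the stock `onnx` Python package and the documented `iree-import-onnx`/`iree-compile` tools; the filenames and the llvm-cpu target are placeholders, not a proposed interface:

```python
import subprocess

import onnx
from onnx import version_converter

# Load the user's model and upgrade it to the minimum supported opset (17)
# so they don't have to roll their own upgrader script.
model = onnx.load("model.onnx")
default_opset = max(
    (op.version for op in model.opset_import if op.domain in ("", "ai.onnx")),
    default=0,
)
if default_opset < 17:
    model = version_converter.convert_version(model, 17)
onnx.save(model, "model_v17.onnx")

# Hand off to the existing, documented tools. A real frontend would drive these
# in-process, default to MLIR bytecode for intermediates, and separate the
# parameters out of the module by default.
subprocess.run(["iree-import-onnx", "model_v17.onnx", "-o", "model.mlir"], check=True)
subprocess.run(
    [
        "iree-compile",
        "model.mlir",
        "--iree-hal-target-backends=llvm-cpu",
        "-o",
        "model.vmfb",
    ],
    check=True,
)
```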
For the record, when I took myself down this journey recently, I started with something simple and ended up with this: https://gist.github.com/stellaraccident/1b3366c129c3bc8e7293fb1353254407
A couple more items are needed:
- Ability to set the entry-point name (this seems to come from some sort of ONNX graph id and is all over the place: I see things like "torch-jit-export" when importing now; a possible workaround is sketched below).
- Ability to set the module name (we often just call these "module", but that keeps them from co-existing when loading multiple into the same context).
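
As an interim workaround, and assuming the entry-point name really is derived from the ONNX graph name, the proto can be rewritten before import; this is a sketch of the effect the option should have, not a confirmed mechanism:

```python
import onnx

model = onnx.load("model.onnx")
# The imported entry-point name appears to be derived from the ONNX graph name
# (hence values like "torch-jit-export"); rewriting it before import is a crude
# stand-in for a real entry-point-name option on the frontend.
model.graph.name = "main"
onnx.save(model, "model_renamed.onnx")
```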
Some of my own observations from a day of coding:
- An authoritative data type enum mapping could be hosted in a common location, similar to https://onnx.ai/onnx/api/mapping.html (see the sketch at the end of this list).
- A step beyond that would be mapping between function signatures (ONNX proto inputs/outputs) and IREE in-process runtime function signatures or iree-run-module/MLIR tooling signatures. Not sure at what level we'd want to share that code... might be test utils, but could also be useful for "real" usage.
- I found myself writing files to disk that could stay in memory, and sometimes loading the same file/model into memory in multiple places. Would be nice to have the option to stay fully in memory (ideally including parameter archives, I guess?)
- ONNX Runtime has its own data structures and bindings: https://onnxruntime.ai/docs/api/python/api_summary.html#data-inputs-and-outputs. May want to interop with those ... though perhaps that would only need to happen in an execution provider, and not the standalone/standard tools.
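
On the data type mapping point above, recent `onnx` releases already expose helpers that could anchor such a table; a hedged sketch (the IREE-side element type strings here are illustrative, not an existing API):

```python
import onnx
from onnx import TensorProto

# onnx already ships an authoritative ONNX -> numpy mapping (recent releases):
np_dtype = onnx.helper.tensor_dtype_to_np_dtype(TensorProto.FLOAT)  # numpy.float32

# A hypothetical shared ONNX -> MLIR/IREE element type table; the right-hand
# strings are illustrative only and would need to live somewhere authoritative.
ONNX_TO_MLIR_ELEMENT_TYPE = {
    TensorProto.FLOAT: "f32",
    TensorProto.FLOAT16: "f16",
    TensorProto.BFLOAT16: "bf16",
    TensorProto.INT32: "i32",
    TensorProto.INT64: "i64",
    TensorProto.BOOL: "i1",
}
```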