
[onnx] Build real onnx frontend cli/API

Open · stellaraccident opened this issue 1 year ago • 1 comment

The current ONNX compilation flow is serviceable but somewhat user-hostile, requiring too much direct low-level manipulation of the system. I recommend creating a real ONNX frontend API and CLI that handles all of the following properly. This would replace the existing front door here: https://iree.dev/guides/ml-frameworks/onnx/

  • Automatic version upgrades to the current minimum supported version (opset 17). There are a lot of old models out there and we have only implemented >= 17 compatibility. We shouldn't be asking users to roll their own Python upgrader script.
  • Proper separation of parameters in the default path (perhaps with an option to leave everything inlined).
  • Serialize intermediates as MLIR bytecode instead of the text format.
  • Enable MLIR reproducers and document what users should do with them (this will work much better if parameter separation is the default).
  • Invoke the compiler pipeline by default rather than requiring the user to run import and compile separately (can still have flags for import-only, etc.).
  • Present error messages that include command lines for the lower-level tools, for repro/diagnostic purposes.
  • Expose it as a supported Python API with a CLI wrapper.
  • When used as a CLI, maybe some eye-candy text-UI progress output, etc.
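A thin wrapper along these lines could start by assembling, and echoing back on failure, the underlying tool invocations. The names here (`OnnxCompileOptions`, `build_commands`) are hypothetical; the only tool flags assumed are `iree-import-onnx ... -o` and `iree-compile --iree-hal-target-backends=... -o` from the current guide. A minimal sketch:

```python
from dataclasses import dataclass

# Hypothetical option set for the proposed frontend; names are illustrative.
@dataclass
class OnnxCompileOptions:
    target_backend: str = "llvm-cpu"
    import_only: bool = False

def build_commands(onnx_path: str, out_path: str, opts: OnnxCompileOptions):
    """Return the tool invocations the wrapper would run.

    Keeping these as explicit argv lists means error messages can echo
    the exact lower-level commands for repro/diagnostic purposes.
    """
    mlir_path = out_path + ".mlir"
    cmds = [["iree-import-onnx", onnx_path, "-o", mlir_path]]
    if not opts.import_only:
        cmds.append([
            "iree-compile", mlir_path,
            f"--iree-hal-target-backends={opts.target_backend}",
            "-o", out_path,
        ])
    return cmds
```

The real implementation would run these in-process and apply the other bullets (opset upgrade, parameter separation, bytecode intermediates) before the compile step.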

For the record, when I took myself down this journey recently, I started with something simple and ended up with this: https://gist.github.com/stellaraccident/1b3366c129c3bc8e7293fb1353254407

stellaraccident avatar Aug 19 '24 20:08 stellaraccident

A couple more items needed:

  • Ability to set the entry-point name (this seems to come from some ONNX graph ID of some kind and is all over the place: I see names like "torch-jit-export" when importing now).
  • Ability to set the module name (we often just call these "module", but that keeps them from coexisting when loading multiple into the same context).
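As a stopgap until the importer grows such options, the module name can be patched onto the imported MLIR text. This is a fragile text-level sketch (a real implementation would set the symbol name through the importer or MLIR APIs rather than string munging), and `set_module_name` is a hypothetical helper:

```python
import re

def set_module_name(mlir_text: str, name: str) -> str:
    """Give the top-level `module` op a symbol name so that several
    compiled modules can coexist in one runtime context.

    Replaces an existing `@old_name` on the first `module` op if present;
    otherwise inserts one.
    """
    return re.sub(r"\bmodule\b(\s+@[\w$.]+)?", f"module @{name}", mlir_text, count=1)
```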

stellaraccident avatar Aug 20 '24 00:08 stellaraccident

Some of my own observations from a day of coding:

  • An authoritative data-type enum mapping could be hosted in a common location, similar to https://onnx.ai/onnx/api/mapping.html.
  • A step beyond that would be mapping between function signatures (ONNX proto inputs/outputs) and IREE in-process runtime function signatures or iree-run-module / MLIR tooling signatures. Not sure at what level we'd want to share that code... might be test utils, but could also be useful for "real" usage.
  • I found myself writing files to disk that could stay in memory, and sometimes loading the same file/model into memory in multiple places. Would be nice to have the option to stay fully in memory (ideally including parameter archives, I guess?).
  • ONNX Runtime has its own data structures and bindings: https://onnxruntime.ai/docs/api/python/api_summary.html#data-inputs-and-outputs. We may want to interop with those... though perhaps that would only need to happen in an execution provider, not in the standalone/standard tools.
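For the data-type bullet, such a mapping could start as a plain table keyed by the `TensorProto.DataType` integer values, which are fixed by `onnx.proto`. The MLIR element-type spellings below follow `iree-run-module`-style signatures; coverage and the unsigned/signless conventions here are illustrative, not authoritative:

```python
# ONNX TensorProto.DataType enum values (fixed by onnx.proto) mapped to
# MLIR element-type spellings. Partial, illustrative coverage only.
ONNX_DTYPE_TO_MLIR = {
    1:  "f32",   # FLOAT
    2:  "ui8",   # UINT8  (unsigned spelling is a convention choice;
    3:  "i8",    # INT8    MLIR integers are otherwise signless)
    6:  "i32",   # INT32
    7:  "i64",   # INT64
    9:  "i1",    # BOOL
    10: "f16",   # FLOAT16
    11: "f64",   # DOUBLE
    16: "bf16",  # BFLOAT16
}
```

A hosted version of this table would be the natural base layer for the signature-mapping step described above.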

ScottTodd avatar Sep 07 '24 00:09 ScottTodd