[onnx] Build real onnx frontend cli/API
The current ONNX compilation flow is serviceable but somewhat user-hostile, requiring too much direct, low-level manipulation of the system. Recommend creating a real ONNX frontend API and CLI which handles all of the following properly. This would replace the existing front door here: https://iree.dev/guides/ml-frameworks/onnx/
- Automatic version upgrades to the current minimum supported opset version (17). There are a lot of old models out there and we have only implemented >= 17 compatibility. We shouldn't be asking users to roll their own Python upgrader script (see the sketch after this list).
- Proper separation of parameters in the default path (perhaps with an option to leave everything inlined).
- Serialize intermediates with MLIR bytecode vs text format.
- Enable MLIR reproducers and document what users should do with them (will work much better if parameter separation is default).
- Invoke the compiler pipeline by default instead of requiring the user to run import and compile as separate steps (can still have flags to do import only, etc).
- Present error messages that include the command lines for the lower-level tools, for repro/diagnostic purposes.
- Exposed as a supported Python API with CLI wrapper.
- When used as a CLI, maybe some eye-candy text UI progress, etc.
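
For context, a minimal sketch of the kind of pre-processing such a wrapper could do today, built only on the stock `onnx` Python package and the documented `iree-import-onnx`/`iree-compile` tools; the filenames and the llvm-cpu target are placeholders, not a proposed interface:

```python
import subprocess

import onnx
from onnx import version_converter

# Load the user's model and upgrade it to the minimum supported opset (17)
# so they don't have to roll their own upgrader script.
model = onnx.load("model.onnx")
default_opset = max(
    (op.version for op in model.opset_import if op.domain in ("", "ai.onnx")),
    default=0,
)
if default_opset < 17:
    model = version_converter.convert_version(model, 17)
onnx.save(model, "model_v17.onnx")

# Hand off to the existing, documented tools. A real frontend would drive these
# in-process, default to MLIR bytecode for intermediates, and separate the
# parameters out of the module by default.
subprocess.run(["iree-import-onnx", "model_v17.onnx", "-o", "model.mlir"], check=True)
subprocess.run(
    [
        "iree-compile",
        "model.mlir",
        "--iree-hal-target-backends=llvm-cpu",
        "-o",
        "model.vmfb",
    ],
    check=True,
)
```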
For the record, when I took myself down this journey recently, I started with something simple and ended up with this: https://gist.github.com/stellaraccident/1b3366c129c3bc8e7293fb1353254407
A couple more items are needed:
- Ability to set the entry-point name (this seems to come from some sort of ONNX graph id and is all over the place: I see things like "torch-jit-export" when importing now; a possible workaround is sketched below).
- Ability to set the module name (we often just call these "module", but that keeps them from co-existing when loading multiple into the same context).
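
As an interim workaround, and assuming the entry-point name really is derived from the ONNX graph name, the proto can be rewritten before import; this is a sketch of the effect the option should have, not a confirmed mechanism:

```python
import onnx

model = onnx.load("model.onnx")
# The imported entry-point name appears to be derived from the ONNX graph name
# (hence values like "torch-jit-export"); rewriting it before import is a crude
# stand-in for a real entry-point-name option on the frontend.
model.graph.name = "main"
onnx.save(model, "model_renamed.onnx")
```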
Some of my own observations from a day of coding:
- An authoritative data type enum mapping could be hosted in a common location, similar to https://onnx.ai/onnx/api/mapping.html (see the sketch at the end of this list).
- A step beyond that would be mapping between function signatures (ONNX proto inputs/outputs) and IREE in-process runtime function signatures or iree-run-module/MLIR tooling signatures. Not sure at what level we'd want to share that code... might be test utils, but could also be useful for "real" usage.
- I found myself writing files to disk that could stay in memory, and sometimes loading the same file/model into memory in multiple places. Would be nice to have the option to stay fully in memory (ideally including parameter archives, I guess?)
- ONNX Runtime has its own data structures and bindings: https://onnxruntime.ai/docs/api/python/api_summary.html#data-inputs-and-outputs. May want to interop with those ... though perhaps that would only need to happen in an execution provider, and not the standalone/standard tools.
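
On the data type mapping point above, recent `onnx` releases already expose helpers that could anchor such a table; a hedged sketch (the IREE-side element type strings here are illustrative, not an existing API):

```python
import onnx
from onnx import TensorProto

# onnx already ships an authoritative ONNX -> numpy mapping (recent releases):
np_dtype = onnx.helper.tensor_dtype_to_np_dtype(TensorProto.FLOAT)  # numpy.float32

# A hypothetical shared ONNX -> MLIR/IREE element type table; the right-hand
# strings are illustrative only and would need to live somewhere authoritative.
ONNX_TO_MLIR_ELEMENT_TYPE = {
    TensorProto.FLOAT: "f32",
    TensorProto.FLOAT16: "f16",
    TensorProto.BFLOAT16: "bf16",
    TensorProto.INT32: "i32",
    TensorProto.INT64: "i64",
    TensorProto.BOOL: "i1",
}
```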