BentoML
feat: ONNX-mlir
Provide support for ONNX-MLIR as a first-class citizen.
TODO:
- [x] Use the provided docker container for compilation
- [x] Provide the env var `ONNX_MLIR_DIR` for users who build onnx-mlir locally
- [x] Need a way to safely add `PyRuntime` to `sys.path` (see the sketch after this list)
- [x] Support `multi_entrypoint` compiled models
- [ ] Mount the docker container and use the prebuilt binary
  - This means users don't have to go through the pain of setting up onnx-mlir.
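Roughly the kind of `ONNX_MLIR_DIR` handling I have in mind (a sketch only; the `lib` subpath and the `_load_pyruntime` helper name are assumptions, not the final implementation):

```python
import os
import sys


def _load_pyruntime():
    """Locate onnx-mlir's PyRuntime module from a local build.

    ONNX_MLIR_DIR is assumed to point at the user's onnx-mlir build
    directory; the exact layout of the build tree may differ between
    onnx-mlir versions.
    """
    onnx_mlir_dir = os.environ.get("ONNX_MLIR_DIR")
    if onnx_mlir_dir is None:
        raise RuntimeError(
            "Set ONNX_MLIR_DIR to your local onnx-mlir build directory."
        )
    # PyRuntime typically lands under <build>/lib in a local build;
    # adjust the subpath if your build tree differs.
    runtime_dir = os.path.join(onnx_mlir_dir, "lib")
    if runtime_dir not in sys.path:
        sys.path.append(runtime_dir)
    try:
        from PyRuntime import ExecutionSession
    except ImportError as e:
        raise RuntimeError(f"PyRuntime not importable from {runtime_dir}") from e
    return ExecutionSession
```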
Maybe provide support for the older ExecutionSession API from older onnx-mlir releases. Since this library is pretty obscure, I assume everyone is living on the edge and installing the latest changes anyway.
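Something like this could paper over the two constructor variants (a sketch; the exact signatures of the older and newer `ExecutionSession` APIs should be double-checked against the onnx-mlir release being targeted):

```python
def make_session(compiled_model_path: str, entry_point: str = "run_main_graph"):
    """Create an onnx-mlir ExecutionSession, falling back to the older
    two-argument constructor when the newer one-argument form is unavailable."""
    from PyRuntime import ExecutionSession

    try:
        # Newer onnx-mlir API: the entry point is discovered automatically.
        return ExecutionSession(compiled_model_path)
    except TypeError:
        # Older API required the entry point name to be passed explicitly.
        return ExecutionSession(compiled_model_path, entry_point)


# Usage: inputs are passed as a list of numpy arrays, e.g.
#   session = make_session("model.so")
#   outputs = session.run([np.random.rand(1, 3, 224, 224).astype(np.float32)])
```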
This will be looked at after the 1.0 release, as ONNX-MLIR is not a top-priority item.
@andrewsi-z for your awareness
Sounds good! @aarnphm, with 'use the docker container as an entrypoint for onnx-mlir', I assume this would be to take an ONNX model as input and handle compilation as part of the abstraction layer, rather than requiring compilation as a prerequisite step (which the current approach relies on)?
That is correct. We will accept any `.onnx` file and use the docker container to compile the model.
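Roughly what the docker-based compile step could look like (a sketch; the image name used here is a placeholder, and only the standard `--EmitLib` flag is assumed):

```python
import subprocess
from pathlib import Path


def compile_with_docker(onnx_model: str, image: str = "onnxmlir/onnx-mlir") -> Path:
    """Compile an .onnx file into a shared library with an onnx-mlir docker
    image by mounting the model's directory into the container."""
    model = Path(onnx_model).resolve()
    workdir = model.parent
    subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{workdir}:/workdir",
            image,
            "--EmitLib",                     # emit a .so next to the model
            f"/workdir/{model.name}",
        ],
        check=True,
    )
    return workdir / (model.stem + ".so")
```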
I'm also thinking of providing a helper layer where users can pass in any model file (say a TensorFlow or PyTorch model) and have it converted to ONNX. But IMO that is a bit outside the scope. We could do something like that for ONNX as well in the future.
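For PyTorch, such a helper would essentially wrap something like this (illustrative sketch; the `to_onnx` name and the fixed opset are assumptions):

```python
import torch


def to_onnx(model: torch.nn.Module, example_input: torch.Tensor, out_path: str) -> str:
    """Export a PyTorch model to ONNX as a possible default conversion path.
    Users who need custom opsets or dynamic axes would bypass this helper."""
    model.eval()
    torch.onnx.export(model, example_input, out_path, opset_version=13)
    return out_path


# Usage with any nn.Module, e.g. a torchvision ResNet:
#   to_onnx(my_model, torch.randn(1, 3, 224, 224), "model.onnx")
```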
@aarnphm I wonder if we want to convert TF/PT models to ONNX for users; a lot of fine-tuning can happen in that process. Better to give users some references in the documentation.
You are right. We probably want to let the user decide. However, could we provide an optional default?