onnx-mlir
inline function calls during onnx import?
it seems like we can inline all function calls during onnx import, both model-local functions (see issue #1841) and custom functions as in CustomFnTest.cpp
this would simplify the import logic a little and would make it possible to remove ONNXCallOp, giving us one less thing to analyze, transform, and convert in the compiler
see proof of concept implementation in PR #2242
Can we use inlining provided by MLIR after the function is imported into onnx-mlir? To me, the priority of this issue is low.
> Can we use inlining provided by MLIR after the function is imported into onnx-mlir?
maybe, but I think the protobuf-to-mlir import logic in FrontendDialectTransformer is simplest if we inline there instead of constructing the function call in MLIR
> To me, the priority of this issue is low.
right, it's not a priority right now, but that's partly because we haven't started implementing shape inference and lowering for function calls yet; if we inline now, we won't have to do any of that
High-level comments: I know the ONNX standard wants to implement some of its newer operators as functions of existing ONNX ops. I'm not sure whether they arrive as functions in the protobuf, but clearly we may want to keep some of these higher-level ops as single ops. For example, the LSTM operator could be represented as a function of smaller ONNX ops, yet we want to treat that LSTM as a single op that we support directly, especially when pushing it to our NNPA accelerator, which supports some LSTM ops natively.
Again, apologies because I don't know how these complex ONNX ops come into our protobufs, but if they come as functions, we will have to be selective about which ones to inline. We also had a request to support non-standard ONNX ops that are implemented in ORT... so that is a second source of functions that may or may not be expanded.
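Being selective could be as simple as a predicate the importer consults before expanding a function call. A toy sketch, where the set contents and names are hypothetical, not onnx-mlir's actual policy:

```python
# Functions we want to keep as single ops (e.g. to later match them to an
# accelerator kernel such as NNPA's LSTM) are skipped by the inliner;
# everything else gets expanded into its defining body. The names here
# are illustrative only.
KEEP_AS_OPS = {"LSTM", "GRU"}

def should_inline(func_name, has_known_lowering):
    # Keep the call intact only when we both want it whole and can
    # actually lower it as one unit; otherwise expand it.
    return func_name not in KEEP_AS_OPS or not has_known_lowering
```

The same predicate could cover both sources of functions mentioned above: newer ONNX operators defined as functions, and non-standard ORT-style function ops.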
Maybe someone could look into this issue and provide an explanation here for the benefit of others. Any takers?
Note that the lack of a polymorphic type/shape system, combined with the way we lower with fixed shapes, means it will only be practical to share functions between call sites with identical types and shapes. I expect that will make anything other than inlining impractical once we get around to type inference and lowering for functions and function calls. Therefore I propose that we inline now. If we later figure out polymorphic types and shapes and all that, we can revert the change and go back to calling functions.
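To spell out why fixed-shape lowering pushes us toward inlining: a compiler that kept the calls would have to clone one specialized copy of the function per distinct shape signature anyway, at which point inlining at every call site is barely more code. A toy sketch of that shape-keyed specialization (all names hypothetical):

```python
# Without polymorphic shapes, one lowered function body can only serve
# call sites whose argument shapes match exactly, so each new shape
# signature forces a fresh clone of the function.
specializations = {}

def specialize(func_name, arg_shapes):
    """Return a unique symbol for (function, argument shapes),
    cloning a specialized copy on first use."""
    key = (func_name, tuple(arg_shapes))
    if key not in specializations:
        shape_tag = "x".join("_".join(map(str, s)) for s in arg_shapes)
        specializations[key] = f"{func_name}_{shape_tag}"
    return specializations[key]
```

Two call sites with shapes (2, 3) and (4,) already need two copies; the bookkeeping grows with every distinct signature, which is exactly the cost inlining sidesteps.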
> remove ONNXCallOp, giving us one less thing to analyze, transform, and convert in the compiler
I took a closer look at ONNXCallOp and it looks like a verbatim copy of func::CallOp.
It seems like we could remove ONNXCallOp and just use func::CallOp, the same way we already use func::ReturnOp.