onnx-mlir
Initialize conversion passes from ONNX to Torch-MLIR backend contract
Includes LLVM lit tests for ONNXAddOp and ONNXConstantOp. This PR is a result of the RFC discussed in #1639. The following overview of the PR is collected from there, but see the RFC for a more detailed discussion. The PR adds four passes to the conversion pipeline (see the example invocation after this list):
- "convert-onnx-to-torch" handles the op-by-op lowering to the torch dialect
- "convert-function-types-to-torch-types" converts function arguments to torch types (e.g. torch.vtensor) that weren't converted in the previous pass
- "finalize-torch-type-conversion" finalizes the conversion to torch types and removes any remaining UnrealizedConversionCastOps
- "erase-onnx-entry-point" removes the ONNX entry-point op; for now this is needed for compatibility with torch-mlir
As requested, there is a CMake option, ONNX_MLIR_ENABLE_TORCH, to disable the torch-mlir build.
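For example, the integration could be switched off at configure time roughly like this (a sketch; the remaining flags follow the usual onnx-mlir build setup, with MLIR_DIR pointing at your MLIR CMake directory):

```sh
# Configure onnx-mlir with the Torch-MLIR integration disabled
# (ONNX_MLIR_ENABLE_TORCH is the option added by this PR).
cmake -G Ninja .. \
  -DMLIR_DIR=${MLIR_DIR} \
  -DONNX_MLIR_ENABLE_TORCH=OFF
```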
One point of awkwardness here comes from torch-mlir including llvm-project as a submodule. When torch-mlir is added to onnx-mlir as a submodule, cloning recursively therefore clones the entirety of llvm-project as well (it isn't built, since we only build the parts of torch-mlir that we need). Suggestions on how to manage the submodules in this case, as well as any comments on whether/how to break up this PR, are welcome.
If there is anything in particular needed to help with reviewing here please let me know.
Can one of the admins verify this patch?
Thank you for the great work! You could probably configure a shallow submodule clone.
```sh
git config -f .gitmodules submodule.third_party/torch-mlir.shallow true
```
> Thank you for the great work! You could probably configure a shallow submodule clone.
Hmm, as far as I can tell a shallow submodule clone won't prevent the full llvm-project submodule in Torch-MLIR from being cloned when cloning with
```sh
git clone --recursive https://github.com/onnx/onnx-mlir.git
```
Correct me if I'm wrong, but it didn't work when I tried your suggestion. I opened an issue in Torch-MLIR about formal support for this use case (https://github.com/llvm/torch-mlir/issues/1411).
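One interim workaround, assuming the nested llvm-project checkout inside torch-mlir isn't needed for the build, is to skip --recursive and initialize only onnx-mlir's direct submodules, since git submodule update without --recursive stops at the first level:

```sh
# Clone without recursing into nested submodules
git clone https://github.com/onnx/onnx-mlir.git
cd onnx-mlir
# Initialize only the direct submodules; torch-mlir's own
# llvm-project submodule is left uninitialized.
git submodule update --init
```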
@jenkins-droid test this please
It looks like CMake on Windows isn't able to find the Torch-MLIR external dialects (TMTensor). I'll have to set up a Windows environment to iron out the CI problems.
I can convert this PR to a draft until I verify that the Windows build works, but style comments/reviews are still welcome at this point.
You need to abide by the DCO; see the Contributing section of the main README.md. Every commit needs to be signed off. The check can be overridden on this page (see the details next to DCO), but this should be used only exceptionally.
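For reference, the sign-off itself is a standard git feature (the commit message below is illustrative):

```sh
# Add the Signed-off-by trailer required by DCO to a new commit
git commit -s -m "Lower ONNXAddOp to torch dialect"
# Or retroactively sign off existing commits on the branch,
# e.g. the last three:
git rebase --signoff HEAD~3
```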
I pushed a new change that switches to a temporary fork of Torch-MLIR, which will be used to iterate on this PR without needing to create PRs in Torch-MLIR for each small change. Once the review is done here, we can switch back to the main branch of Torch-MLIR. With this I was able to build successfully in the Docker image used by the CI, with all unit tests passing, but I haven't tested locally on all platforms.
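For anyone reproducing this locally, pointing a submodule at a fork is typically done along these lines (a sketch; the fork URL below is a placeholder, not the actual fork used in this PR):

```sh
# Point the torch-mlir submodule at a fork (placeholder URL)
git config -f .gitmodules \
  submodule.third_party/torch-mlir.url https://github.com/<user>/torch-mlir.git
# Propagate the .gitmodules change to the local git config
git submodule sync third_party/torch-mlir
# Fetch and check out the submodule at the recorded commit
git submodule update --init third_party/torch-mlir
```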
With this, if there are more CI failures, especially as this PR is reviewed, I am wondering whether there is a way to set things up so that running the CI doesn't require admin approval each time, to allow for faster iteration (perhaps on a temporary branch within ONNX-MLIR). @AlexandreEichenberger, if you or one of the other admins could help/comment here, that would be greatly appreciated! If this isn't possible, no problem, but I figured I would ask.