onnxruntime Allow link to system-installed onnx.

When using shared protobuf, both onnx and ort register to the same place. This patch ensures both sides go through the same proto .so and only load once.

Aug 03 '22 12:08 xkszltl

This is a new feature request. Please let our PMs know why it is needed. I can't make any move before having a formal discussion with the PMs and dev managers.

Sep 07 '22 04:09 snnn

Thanks for the update.

Please let our PMs know why it is needed.

@snnn Could you help driving the discussion and clarify what kind of "why" not covered by the current PR description is missing? You may include the PM in this thread.

Sep 07 '22 05:09 xkszltl

The background here is if we load onnx+ort, especially the one built with shared pb, together into memory, they'll conflict because both carry onnx schemas under the same name and reg to the same registry. By link to the shared system onnx, it'll only load once by loader to resolve symbols without duplication.

Sep 07 '22 05:09 xkszltl

/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux Nuphar CI Pipeline,Linux OpenVINO CI Pipeline

Sep 08 '22 23:09 skottmckay

/azp run MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline

Sep 08 '22 23:09 skottmckay

Azure Pipelines successfully started running 7 pipeline(s).

Sep 08 '22 23:09 azure-pipelines[bot]

Azure Pipelines successfully started running 9 pipeline(s).

Sep 08 '22 23:09 azure-pipelines[bot]

@faxu, any update please?

Sep 15 '22 17:09 wschin

Hi @xkszltl and thank you for your suggested contribution. In order to keep the codebase manageable, we request that these 4 criteria are met:

Confirm that this change does not impact the regular/standard workflow.
Confirm that this change does not make the code unnecessarily complex and unmaintainable to solve, given that this change is only for one use case.
Ensure that here are tests to validate the changes (wherever possible). Please elaborate on how we can test this to ensure it's correctly working, and whether it's possible to automate this test.
Ensure that the feature is documented so that it's clear how others can benefit from this as well. Please add this to your PR through code annotation.

Sep 16 '22 20:09 faxu

Depends on the definition of "standard", could you clarify? More specific, this changes the onnx used when -D onnxruntime_PREFER_SYSTEM_LIB=ON, because ort currently doesn't have finer-grain switches. But is onnxruntime_PREFER_SYSTEM_LIB used by the "standard workflow"?
It doesn't, there's only 15 lines of non-whitespace code, to add a extra branch. And it's more of a bugfix than a feature, so not gonna be "unnecessarily". Actually I'm seen multiple requests on the related changes when using shared protobuf.
At this point it's tested in our side. I would certainly love to have CI in ort as well but it depends on custom build of onnx, which I don't think ort has in CI at this point? Any suggestions on that?
Will do. Regarding "code annotation", you mean comment in code or on github?

Sep 16 '22 22:09 xkszltl

@snnn / @pranavsharma can you review comments 1 and 3?

For 4 - describing this as a comment in the code should be ok for a paper trail, unless you have another suggestion for a good place to document this and how to use.

Sep 16 '22 23:09 faxu

Depends on the definition of "standard", could you clarify? More specific, this changes the onnx used when -D onnxruntime_PREFER_SYSTEM_LIB=ON, because ort currently doesn't have finer-grain switches. But is onnxruntime_PREFER_SYSTEM_LIB used by the "standard workflow"?

We don't use onnxruntime_PREFER_SYSTEM_LIB in the regular workflow.

At this point it's tested in our side. I would certainly love to have CI in ort as well but it depends on custom build of onnx, which I don't think ort has in CI at this point? Any suggestions on that?

We won't have the resources to create a CI on our side.

Sep 16 '22 23:09 pranavsharma

We regularly need to refactor our code. So, when we make changes, how do we verify if this still works? If you could give us some instructions, it would be very helpful.

Sep 17 '22 02:09 snnn

The workflow would be to compile protobuf as shared lib, and compile onnx to use that. These are the script we used for that:

https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/protobuf.sh#L47-L65
https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/onnx.sh#L112-L133
https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/ort.sh#L106-L153

and to test the build, you can simply load both libs from onnx+ort, or if python parts are built as in the rest part of script, we simple run:

https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/ort.sh#L215-L232

Since you don't have CI resources for coverage and the related parts was never covered, I would suggest to merge this without testing first, as that would at least unblock people depending on it.

Oct 16 '22 12:10 xkszltl

onnxruntime onnxruntime copied to clipboard

Allow link to system-installed onnx.

onnxruntime
onnxruntime copied to clipboard