onnxruntime
onnxruntime copied to clipboard
Allow link to system-installed onnx.
When using shared protobuf, both onnx and ort register to the same place.
This patch ensures both sides go through the same proto .so
and only load once.
This is a new feature request. Please let our PMs know why it is needed. I can't make any move before having a formal discussion with the PMs and dev managers.
Thanks for the update.
Please let our PMs know why it is needed.
@snnn Could you help driving the discussion and clarify what kind of "why" not covered by the current PR description is missing? You may include the PM in this thread.
The background here is if we load onnx+ort, especially the one built with shared pb, together into memory, they'll conflict because both carry onnx schemas under the same name and reg to the same registry. By link to the shared system onnx, it'll only load once by loader to resolve symbols without duplication.
/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux Nuphar CI Pipeline,Linux OpenVINO CI Pipeline
/azp run MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline
Azure Pipelines successfully started running 7 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
@faxu, any update please?
Hi @xkszltl and thank you for your suggested contribution. In order to keep the codebase manageable, we request that these 4 criteria are met:
- Confirm that this change does not impact the regular/standard workflow.
- Confirm that this change does not make the code unnecessarily complex and unmaintainable to solve, given that this change is only for one use case.
- Ensure that here are tests to validate the changes (wherever possible). Please elaborate on how we can test this to ensure it's correctly working, and whether it's possible to automate this test.
- Ensure that the feature is documented so that it's clear how others can benefit from this as well. Please add this to your PR through code annotation.
- Depends on the definition of "standard", could you clarify? More specific, this changes the onnx used when
-D onnxruntime_PREFER_SYSTEM_LIB=ON
, because ort currently doesn't have finer-grain switches. But isonnxruntime_PREFER_SYSTEM_LIB
used by the "standard workflow"? - It doesn't, there's only 15 lines of non-whitespace code, to add a extra branch. And it's more of a bugfix than a feature, so not gonna be "unnecessarily". Actually I'm seen multiple requests on the related changes when using shared protobuf.
- At this point it's tested in our side. I would certainly love to have CI in ort as well but it depends on custom build of onnx, which I don't think ort has in CI at this point? Any suggestions on that?
- Will do. Regarding "code annotation", you mean comment in code or on github?
@snnn / @pranavsharma can you review comments 1 and 3?
For 4 - describing this as a comment in the code should be ok for a paper trail, unless you have another suggestion for a good place to document this and how to use.
- Depends on the definition of "standard", could you clarify? More specific, this changes the onnx used when
-D onnxruntime_PREFER_SYSTEM_LIB=ON
, because ort currently doesn't have finer-grain switches. But isonnxruntime_PREFER_SYSTEM_LIB
used by the "standard workflow"?
We don't use onnxruntime_PREFER_SYSTEM_LIB in the regular workflow.
- At this point it's tested in our side. I would certainly love to have CI in ort as well but it depends on custom build of onnx, which I don't think ort has in CI at this point? Any suggestions on that?
We won't have the resources to create a CI on our side.
We regularly need to refactor our code. So, when we make changes, how do we verify if this still works? If you could give us some instructions, it would be very helpful.
The workflow would be to compile protobuf as shared lib, and compile onnx to use that. These are the script we used for that:
- https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/protobuf.sh#L47-L65
- https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/onnx.sh#L112-L133
- https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/ort.sh#L106-L153
and to test the build, you can simply load both libs from onnx+ort, or if python parts are built as in the rest part of script, we simple run:
- https://github.com/xkszltl/Roaster/blob/b6e950c0a67eac014a1dcff8f084ce71e03f6364/pkgs/ort.sh#L215-L232
Since you don't have CI resources for coverage and the related parts was never covered, I would suggest to merge this without testing first, as that would at least unblock people depending on it.