ONNX backend not found
Description
I get the error: "failed to load 'ambernet' version 1: Invalid argument: unable to find backend library for backend 'onnxruntime', try specifying runtime on the model configuration."
Triton Information
I am using the latest tensorrt-llm container for 25.06; from my understanding this should include onnxruntime 1.22. Where is my error? Or do I always have to build the ONNX backend from source, and is onnxruntime just a package I can use in my Python models? If so, are there expected performance differences between the onnxruntime backend and, for example, using onnxruntime via pip inside a Python model?
Are you using the Triton container or did you build it yourself?
To Reproduce
Just use the container and try to start an ONNX model with a config.pbtxt that sets backend: "onnxruntime".
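For reference, a minimal config.pbtxt of the kind I am using (the input/output names, data types, and dims are placeholders for illustration; only the backend field matters here):

```
name: "ambernet"
backend: "onnxruntime"
max_batch_size: 8

input [
  {
    name: "audio_signal"   # placeholder input name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "logits"         # placeholder output name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```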
@protonicage your tensorrt-llm image does not include the ONNX backend. See this note in the docs: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver#:~:text=The%20xx.yy%2Dtrtllm%2Dpython%2Dpy3%20image%20contains%20the%20Triton%20Inference%20Server%20with%20support%20for%20TensorRT%2DLLM%20and%20Python%20backends%20only.
If you have an ONNX file, loading it with the onnxruntime backend is recommended, as that will be faster and more memory-efficient than loading it in the Python backend.
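For comparison, the pip/Python-backend alternative you mention would look roughly like the sketch below (tensor names, the ONNX file location, and the path construction are placeholders, not a definitive implementation). You end up re-implementing the request/response handling yourself and adding an extra hop through Python and numpy, which is part of why the native onnxruntime backend is usually the better choice.

```python
# model.py for a Triton Python-backend model that wraps onnxruntime.
# Minimal sketch; "INPUT0"/"OUTPUT0" and the model path are placeholders.
import numpy as np
import onnxruntime as ort
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Illustrative path: assumes the ONNX file sits in the model's version directory.
        model_path = args["model_repository"] + "/" + args["model_version"] + "/model.onnx"
        self.session = ort.InferenceSession(
            model_path,
            providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # Copy the Triton input into numpy, run onnxruntime, copy the result back.
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            out0 = self.session.run(None, {"INPUT0": in0})[0]
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[pb_utils.Tensor("OUTPUT0", out0.astype(np.float32))]
                )
            )
        return responses
```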