onnxruntime_backend
The Triton backend for the ONNX Runtime.
**Is your feature request related to a problem? Please describe.** I'm serving a model that supports batching (`max_batch_size` > 0) and I would like to use config autocomplete, but I...
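For reference, a minimal explicit `config.pbtxt` for a batched ONNX model looks roughly like the sketch below; the model name, tensor names, data types, and shapes are placeholders for illustration, not values taken from the report above.

```
# Hypothetical explicit config for a batched model served by the
# onnxruntime backend; names, types, and dims are illustrative only.
name: "my_onnx_model"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

With config autocomplete enabled, the backend attempts to derive these fields from the ONNX model's own metadata instead of requiring them to be spelled out.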
**Is your feature request related to a problem? Please describe.** TRT cache gets regenerated whenever the model path changes. This is an issue when model file override is used. There...
**Description** Engine files are not created using TensorRTExecutionProvider optimization when `trt_engine_cache_enable` is true. Whether `trt_engine_cache_path` is defined or not doesn't seem to change anything. **Triton Information** I tried...
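For context, a commonly used shape for this configuration is the per-accelerator parameter block sketched below; the precision mode and cache path shown are assumptions to adapt, not values from this report.

```
# Sketch: enabling the TensorRT execution provider with engine caching in
# config.pbtxt; the precision mode and cache path are assumptions.
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }
        parameters { key: "trt_engine_cache_enable" value: "true" }
        parameters { key: "trt_engine_cache_path" value: "/tmp/trt_cache" }
      }
    ]
  }
}
```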
**Description** What have I done? - Built the Triton Inference Server on Ubuntu 20.04 without Docker - Built the ONNX Runtime backend. The error: I don't know the exact error, but...
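For orientation, the usual out-of-source cmake build of this backend looks roughly like the sketch below; the version values are assumptions and must match the targeted Triton release, and note that this standard path builds ONNX Runtime inside a build container, so a truly Docker-free build needs a locally built ONNX Runtime instead.

```
# Rough sketch of building the onnxruntime backend out of source;
# the version values below are assumptions, not taken from the report.
git clone https://github.com/triton-inference-server/onnxruntime_backend.git
cd onnxruntime_backend
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.15.1 \
      -DTRITON_BUILD_CONTAINER_VERSION=23.08 ..
make install
```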
I need to run the Triton Server using an ONNX model that generates a TensorRT engine on-the-fly. I'm aware that I could use the trtexec utility to generate the TensorRT...
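One option, sketched below under the assumption of a single FP32 image input named `input`, is to add a warmup sample to the model config so that the on-the-fly TensorRT engine build happens at model load time rather than on the first inference request.

```
# Sketch: a warmup request in config.pbtxt to trigger the TensorRT engine
# build at load time; the input name, type, and dims are assumptions.
model_warmup [
  {
    name: "trt_engine_build_warmup"
    batch_size: 1
    inputs {
      key: "input"
      value: {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        zero_data: true
      }
    }
  }
]
```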
For every instance in a model instance group, a new ORT session is created. This change adds support for sharing a single session across the instances in a group. This support can be enabled...
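Purely as an illustration of what such a toggle could look like in the model config (the exact option name is cut off above, so `share_session` here is a hypothetical key, not necessarily the one this change introduces):

```
# Hypothetical parameter enabling one shared ORT session per instance group;
# the key name is illustrative and may differ from the actual option.
parameters { key: "share_session" value: { string_value: "true" } }
```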
I was trying to deploy a custom model on Triton Server (23.08) with the onnxruntime_backend (ONNX Runtime 1.15.1), but while doing so I am facing this issue: ``` onnx runtime error...
```
/home/aniket/server/src/grpc_server.cc: In lambda function:
/home/aniket/server/src/grpc_server.cc:826:24: error: narrowing conversion of ‘(int)byte_size’ from ‘int’ to ‘google::protobuf::stringpiece_internal::StringPiece::size_type’ {aka ‘long unsigned int’} [-Werror=narrowing]
  826 |   {buffer, (int)byte_size}, response->mutable_config());
      |            ^~~~~~~~~~~~~~
/home/aniket/server/src/grpc_server.cc: In instantiation of...
```
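This class of failure is easy to reproduce in isolation; the sketch below uses `std::string_view` as a stand-in for protobuf's `StringPiece` and shows the usual fix of casting to the destination size type. The names are simplified stand-ins, not the actual grpc_server.cc code.

```
// Minimal illustration of the -Werror=narrowing failure above, with
// std::string_view standing in for StringPiece and simplified names.
#include <string_view>

std::string_view make_view(const char* buffer, int byte_size) {
  // Braced initialization rejects the int -> size_type narrowing:
  //   return {buffer, (int)byte_size};   // error: narrowing conversion
  // An explicit cast to the destination type avoids the narrowing error:
  return {buffer, static_cast<std::string_view::size_type>(byte_size)};
}
```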
**Description** I am deploying a YOLOv8 model for object detection using Triton with the ONNX backend on Kubernetes. I have experienced significant CPU throttling in the sidecar container ("queue-proxy") which sits in...
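Independent of the sidecar itself, one knob that commonly affects CPU behaviour of this backend is the size of ONNX Runtime's thread pools; a sketch of capping them in `config.pbtxt` is below, assuming the `intra_op_thread_count` and `inter_op_thread_count` parameters the backend exposes. The values shown are illustrative only, not a tuning recommendation.

```
# Sketch: capping ONNX Runtime's thread pools via model config parameters;
# the counts shown are illustrative, not a tuning recommendation.
parameters { key: "intra_op_thread_count" value: { string_value: "2" } }
parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
```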
@pranavsharma this solves #217 and some other issues that are asking for more option support. I have not tested this change; do you have any recommendations on how to...