onnxruntime_backend
The Triton backend for the ONNX Runtime.
**Is your feature request related to a problem? Please describe.** I'm serving a model that supports batching (`max_batch_size` > 0) and I would like to use config autocomplete, but I...
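For reference, a minimal explicit `config.pbtxt` for a batched ONNX model looks roughly like the sketch below; the model name, tensor names, data types, and shapes are placeholders for illustration, not values taken from the report above.

```
# Hypothetical explicit config for a batched model served by the
# onnxruntime backend; names, types, and dims are illustrative only.
name: "my_onnx_model"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

With config autocomplete enabled, the backend attempts to derive these fields from the ONNX model's own metadata instead of requiring them to be spelled out.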
**Is your feature request related to a problem? Please describe.** TRT cache gets regenerated whenever the model path changes. This is an issue when model file override is used. There...
**Description** Engine files are not created using TensorRTExecutionProvider optimization when `trt_engine_cache_enable` is true. Whether `trt_engine_cache_path` is defined or not doesn't seem to change anything. **Triton Information** I tried...
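For context, a commonly used shape for this configuration is the per-accelerator parameter block sketched below; the precision mode and cache path shown are assumptions to adapt, not values from this report.

```
# Sketch: enabling the TensorRT execution provider with engine caching in
# config.pbtxt; the precision mode and cache path are assumptions.
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }
        parameters { key: "trt_engine_cache_enable" value: "true" }
        parameters { key: "trt_engine_cache_path" value: "/tmp/trt_cache" }
      }
    ]
  }
}
```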
**Description** What have I done? - Built the Triton Inference Server on Ubuntu 20.04 without Docker - Built the ONNX Runtime backend. The error: I don't know the exact error, but...
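For orientation, the usual out-of-source cmake build of this backend looks roughly like the sketch below; the version values are assumptions and must match the targeted Triton release, and note that this standard path builds ONNX Runtime inside a build container, so a truly Docker-free build needs a locally built ONNX Runtime instead.

```
# Rough sketch of building the onnxruntime backend out of source;
# the version values below are assumptions, not taken from the report.
git clone https://github.com/triton-inference-server/onnxruntime_backend.git
cd onnxruntime_backend
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.15.1 \
      -DTRITON_BUILD_CONTAINER_VERSION=23.08 ..
make install
```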
I need to run the Triton Server using an ONNX model that generates a TensorRT engine on-the-fly. I'm aware that I could use the trtexec utility to generate the TensorRT...
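One option, sketched below under the assumption of a single FP32 image input named `input`, is to add a warmup sample to the model config so that the on-the-fly TensorRT engine build happens at model load time rather than on the first inference request.

```
# Sketch: a warmup request in config.pbtxt to trigger the TensorRT engine
# build at load time; the input name, type, and dims are assumptions.
model_warmup [
  {
    name: "trt_engine_build_warmup"
    batch_size: 1
    inputs {
      key: "input"
      value: {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        zero_data: true
      }
    }
  }
]
```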
For every instance in a model instance group, a new ORT session is created. This change adds support for sharing a single session across the instances in a group. This support can be enabled...
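Purely as an illustration of what such a toggle could look like in the model config (the exact option name is cut off above, so `share_session` here is a hypothetical key, not necessarily the one this change introduces):

```
# Hypothetical parameter enabling one shared ORT session per instance group;
# the key name is illustrative and may differ from the actual option.
parameters { key: "share_session" value: { string_value: "true" } }
```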
I was trying to deploy a custom model on Triton Server (23.08) with the onnxruntime_backend (ONNX Runtime 1.15.1), but while doing so I am facing this issue: ``` onnx runtime error...
```
/home/aniket/server/src/grpc_server.cc: In lambda function:
/home/aniket/server/src/grpc_server.cc:826:24: error: narrowing conversion of ‘(int)byte_size’ from ‘int’ to ‘google::protobuf::stringpiece_internal::StringPiece::size_type’ {aka ‘long unsigned int’} [-Werror=narrowing]
  826 |   {buffer, (int)byte_size}, response->mutable_config());
      |            ^~~~~~~~~~~~~~
/home/aniket/server/src/grpc_server.cc: In instantiation of...
```
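This class of failure is easy to reproduce in isolation; the sketch below uses `std::string_view` as a stand-in for protobuf's `StringPiece` and shows the usual fix of casting to the destination size type. The names are simplified stand-ins, not the actual grpc_server.cc code.

```
// Minimal illustration of the -Werror=narrowing failure above, with
// std::string_view standing in for StringPiece and simplified names.
#include <string_view>

std::string_view make_view(const char* buffer, int byte_size) {
  // Braced initialization rejects the int -> size_type narrowing:
  //   return {buffer, (int)byte_size};   // error: narrowing conversion
  // An explicit cast to the destination type avoids the narrowing error:
  return {buffer, static_cast<std::string_view::size_type>(byte_size)};
}
```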
**Description** I am deploying a YOLOv8 model for object detection using Triton with the ONNX backend on Kubernetes. I have experienced significant CPU throttling in the sidecar container ("queue-proxy") which sits in...
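Independent of the sidecar itself, one knob that commonly affects CPU behaviour of this backend is the size of ONNX Runtime's thread pools; a sketch of capping them in `config.pbtxt` is below, assuming the `intra_op_thread_count` and `inter_op_thread_count` parameters the backend exposes. The values shown are illustrative only, not a tuning recommendation.

```
# Sketch: capping ONNX Runtime's thread pools via model config parameters;
# the counts shown are illustrative, not a tuning recommendation.
parameters { key: "intra_op_thread_count" value: { string_value: "2" } }
parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
```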
@pranavsharma this solves #217 and some other issues that are asking for more option support. I have not tested this change; do you have any recommendations on how to...