The Triton backend for the ONNX Runtime.

Results: 81 onnxruntime_backend issues, sorted by most recently updated.

**Is your feature request related to a problem? Please describe.** Our model uses a dropout layer that is removed by the ORT optimizer. It's unusual, but we use dropout in...
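One hedged workaround for the issue above, assuming the backend still maps `optimization.graph.level: -1` to `ORT_DISABLE_ALL` in onnxruntime.cc: turn off ORT graph optimization in the model config so the dropout node is not folded away. A minimal sketch (model name is a placeholder):

```
# config.pbtxt -- sketch only; assumes level -1 maps to ORT_DISABLE_ALL
name: "model_with_dropout"
backend: "onnxruntime"
optimization {
  graph { level: -1 }
}
```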

Perf analyzer fails to create the concurrency manager with the following error: ``` [Model Analyzer] Running perf_analyzer failed with exit status 1 : error: failed to create concurrency manager: input attention_mask...
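This error typically means the model has inputs with dynamic shapes, for which perf_analyzer requires concrete shapes on the command line. A hedged sketch (model name and dims are placeholders; with a batching model, `--shape` excludes the batch dimension):

```bash
# Supply a concrete shape for every dynamic-shape input so
# perf_analyzer can construct its concurrency manager.
perf_analyzer -m my_bert_model \
  --shape input_ids:128 \
  --shape attention_mask:128 \
  --concurrency-range 1:4
```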

**Description** Getting an error `"failed to load 'model_onnx' version 1: Unsupported: Unsupported ONNX Type 'ONNX_TYPE_SEQUENCE' for I/O 'output_probability', expected 'ONNX_TYPE_TENSOR'"` **Triton Information** nvcr.io/nvidia/tritonserver:21.10-py3 Are you using the Triton container or...
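An `output_probability` of type `ONNX_TYPE_SEQUENCE` commonly comes from scikit-learn classifiers exported through skl2onnx, whose default ZipMap wraps probabilities in a sequence of maps. A sketch assuming that export path (the toy classifier stands in for the real model); disabling ZipMap makes the probability output a plain tensor Triton can serve:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Toy classifier standing in for the user's actual model.
X = np.random.rand(100, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(np.int64)
clf = LogisticRegression().fit(X, y)

# zipmap=False turns the probability output into a 2-D float tensor
# instead of a sequence of maps (ONNX_TYPE_SEQUENCE).
onnx_model = convert_sklearn(
    clf,
    initial_types=[("input", FloatTensorType([None, 4]))],
    options={id(clf): {"zipmap": False}},
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```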

**Description** I run the model on Triton Inference Server and also on ONNX Runtime directly. Inference time on Triton Inference Server is 3 ms, but it is 1 ms on ORT....
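For context, the ORT-direct figure in reports like this is usually measured with a bare `InferenceSession` loop, as in the sketch below (input shape and dtype are placeholders). Comparing such a number against perf_analyzer results on Triton helps separate network and serialization overhead from compute time:

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
# Placeholder input; real shape/dtype come from the actual model.
x = np.zeros([d if isinstance(d, int) else 1 for d in inp.shape],
             dtype=np.float32)

# Warm up, then time the bare session to get the ORT-direct baseline.
for _ in range(10):
    sess.run(None, {inp.name: x})
n = 100
t0 = time.perf_counter()
for _ in range(n):
    sess.run(None, {inp.name: x})
print(f"mean latency: {(time.perf_counter() - t0) / n * 1e3:.2f} ms")
```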

**Description** Our Electra-based model takes about 540 ms per inference on CPU with ONNX Runtime (via the mcr.microsoft.com/azureml/onnxruntime:v1.4.0 container). The same model run through Triton r21.02 takes 1000+ ms on...

more-info-needed
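CPU gaps like the Electra regression above are often sensitive to ORT's threading configuration, which this backend exposes as model-config parameters. A sketch (values are illustrative, not a recommendation):

```
# config.pbtxt -- CPU tuning knobs exposed by the onnxruntime backend
parameters { key: "intra_op_thread_count" value: { string_value: "8" } }
parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
parameters { key: "execution_mode" value: { string_value: "0" } }  # 0 = sequential
```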

**Description** I downloaded the yolov3 model weights from [here](https://pjreddie.com/media/files/yolov3.weights). Then, using the TensorRT sample [scripts](https://github.com/NVIDIA/TensorRT/tree/master/samples/python/yolov3_onnx#running-the-sample), I was able to get the corresponding ONNX model file. The obtained ONNX model file...
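Before pointing Triton at a converted model like this, a quick structural check with the onnx package can catch conversion problems early. A minimal sketch (file name is a placeholder):

```python
import onnx

# Validate the converted model and print its I/O signature; a mismatch
# here usually surfaces later as a Triton load failure.
model = onnx.load("yolov3.onnx")
onnx.checker.check_model(model)
for tensor in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_value or d.dim_param for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)
```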

**Is your feature request related to a problem? Please describe.** Currently, the auto-complete function does nothing if the model config provides even a single input or output. See here: https://github.com/triton-inference-server/onnxruntime_backend/blob/main/src/onnxruntime.cc#L652-L680. The...
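Until partial auto-complete exists, the workaround implied by the linked code is to omit the input/output blocks entirely and let the backend derive them from the model. A sketch (assumes tritonserver is started with `--strict-model-config=false`):

```
# config.pbtxt -- no input{} / output{} blocks: per the issue above,
# declaring even one of them currently skips auto-complete entirely.
name: "my_onnx_model"
backend: "onnxruntime"
max_batch_size: 8
```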

**Description** Hello, I have an ONNX model. I am sharing the input and output dimensions of this model below. ![image](https://user-images.githubusercontent.com/81593133/161698185-65e50766-2697-49dc-909e-9adda2547b74.png) I need to deploy this model with Triton Inference Server....
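Deployment here comes down to a config.pbtxt that mirrors the model's I/O signature. Since the actual dimensions are only visible in the screenshot, everything below is illustrative: the names, dtypes, and dims are placeholders to be replaced with the real values (e.g. as printed by the onnx checker snippet earlier):

```
# config.pbtxt -- illustrative only; replace names, dtypes, and dims
# with the model's real I/O signature.
name: "model_onnx"
backend: "onnxruntime"
max_batch_size: 0
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 1, 3, 224, 224 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 1, 1000 ] }
]
```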

**Description** The YOLOv3 ONNX model does not load. **Triton Information** What version of Triton are you using? 2.20 Are you using the Triton container or did you build it yourself? Yes, version...

**Is your feature request related to a problem? Please describe.** ORT exposes a bunch of string key/value configs here https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h but none of them are exposed by this backend. It...

enhancement
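To illustrate what the request above would expose: raw ONNX Runtime already accepts these string key/value entries through `SessionOptions`. A sketch using one real key from the linked header (model path is a placeholder):

```python
import onnxruntime as ort

# The kind of setting the feature request wants surfaced in model config:
# keys are defined in onnxruntime_session_options_config_keys.h.
so = ort.SessionOptions()
so.add_session_config_entry("session.use_env_allocators", "1")
sess = ort.InferenceSession("model.onnx", sess_options=so,
                            providers=["CPUExecutionProvider"])
```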