onnxruntime_backend
The Triton backend for the ONNX Runtime.
Tested with OV 2024.X. This PR should be merged after ORT is upgraded to 1.18.
**Description** In ONNXRuntime, the OpenVINO EP accepts configuration options to set the number of threads and number of streams documented [here](https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#cc-api-20), but these are ignored when passed to the EP...
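For reference, below is a minimal sketch of how these same options are passed to the OpenVINO EP through the ONNX Runtime Python API, outside of Triton. The model path is hypothetical and the exact option keys (`num_of_threads`, `num_streams`) vary by ONNX Runtime and OpenVINO version, so check the OpenVINO EP documentation for the build in use; the issue is that the equivalent settings appear to be ignored when forwarded by the Triton backend.

```python
# Minimal sketch, assuming a local model.onnx and an ORT build with the
# OpenVINO EP enabled. Option keys are taken from the OpenVINO EP docs and
# may differ across ORT versions.
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    providers=[
        (
            "OpenVINOExecutionProvider",
            {
                "device_type": "CPU",     # target device for the EP
                "num_of_threads": "8",    # thread count the issue tries to set
                "num_streams": "2",       # stream count the issue tries to set
            },
        )
    ],
)
print(sess.get_providers())  # confirm which EPs were actually selected
```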
I followed the compilation instructions in the README, and at the end I hit the `UNAVAILABLE: Unsupported: Triton TRITONBACKEND API version: 1.16 does not support 'onnxruntime' TRITONBACKEND API version: 1.19` error...
Bringing this to the `main` branch as well, since the current main pipelines target CUDA 12.5.
server: https://github.com/triton-inference-server/server/pull/7717
**Description** When deploying an ONNX model using the Triton Inference Server's ONNX runtime backend, the inference performance on the CPU is noticeably slower compared to running the same model using...
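When investigating this kind of report, a useful first step is a plain-onnxruntime baseline on the same machine to compare against the latency seen through Triton. The sketch below assumes a local `model.onnx`, a single float32 input, and a thread count of 8; adjust these to match the actual model and the thread settings used in the Triton deployment.

```python
# Rough CPU-latency baseline with plain onnxruntime, for comparison against
# the numbers observed through the Triton ONNX Runtime backend.
import time
import numpy as np
import onnxruntime as ort

sess_opts = ort.SessionOptions()
sess_opts.intra_op_num_threads = 8  # pin threads so both setups are comparable

sess = ort.InferenceSession("model.onnx", sess_opts,
                            providers=["CPUExecutionProvider"])

# Build a dummy input, replacing dynamic dimensions with 1 (assumes float32).
inp = sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
dummy = np.random.rand(*shape).astype(np.float32)

# Warm up, then time repeated runs.
for _ in range(5):
    sess.run(None, {inp.name: dummy})

n = 100
start = time.perf_counter()
for _ in range(n):
    sess.run(None, {inp.name: dummy})
print(f"mean latency: {(time.perf_counter() - start) / n * 1000:.2f} ms")
```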
**Description** I'm trying to deploy a text-to-speech model with ONNX and Triton. When running the server, I get this error: `failed: Protobuf parsing failed.` The model status is also `UNAVAILABLE:`...
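A `Protobuf parsing failed` error from the backend usually means the `.onnx` file itself cannot be parsed (for example a corrupted download or a Git LFS pointer checked out in place of the real file). A quick way to narrow this down is to verify the file parses outside Triton; the repository path below is a hypothetical layout.

```python
# Sanity-check the model file outside Triton when "Protobuf parsing failed"
# is reported. Path is an assumption -- point it at the file actually placed
# in the model repository (e.g. <model_repo>/<model_name>/1/model.onnx).
import onnx

path = "model_repository/tts/1/model.onnx"  # hypothetical repository layout
model = onnx.load(path)            # fails here if the protobuf itself is corrupt
onnx.checker.check_model(model)    # validates graph structure and opset metadata
print(onnx.helper.printable_graph(model.graph)[:500])  # peek at the parsed graph
```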
Hi! > I wasn't sure whether to place this under bug or whether it works as intended. I'm currently facing an issue where a model, deployed via the Triton ONNX backend,...
**Description** When the onnxruntime backend fails to load a model, it causes a GPU memory leak. **Triton Information** r23.12 and r24.07. Are you using the Triton container or did you build it yourself? Using nvcr.io/nvidia/tritonserver:r23.12-py3...