onnxruntime_backend

The Triton backend for the ONNX Runtime.

81 onnxruntime_backend issues, sorted by recently updated

**Description** I would like to be able to replace the `libonnxruntime.so` binary (as well as associated ones) without rebuilding the entire backend, for easier experimentation / testing / debugging. There...
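
For experimentation along those lines, a minimal sketch of a manual swap is below, assuming the backend keeps its ONNX Runtime libraries under `/opt/tritonserver/backends/onnxruntime/` inside the container; the paths and library filename are assumptions, not a documented interface.

```python
# Hypothetical helper: back up and replace the bundled libonnxruntime.so
# with a custom build for experimentation. Paths are assumptions.
import shutil
from pathlib import Path

BACKEND_DIR = Path("/opt/tritonserver/backends/onnxruntime")  # assumed backend install dir
CUSTOM_LIB = Path("/workspace/custom/libonnxruntime.so")      # your own ORT build

def swap_onnxruntime_lib() -> None:
    target = BACKEND_DIR / "libonnxruntime.so"
    backup = target.with_name(target.name + ".orig")
    if not backup.exists():
        shutil.copy2(target, backup)   # keep the original for rollback
    shutil.copy2(CUSTOM_LIB, target)   # drop in the custom build

if __name__ == "__main__":
    swap_onnxruntime_lib()
    print("Replaced libonnxruntime.so; restart tritonserver to pick it up.")
```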

bug

**Description** We have seen performance regressions for an ONNX model: - using the ORT backend - with `Loop` and `Memcpy` nodes (the latter is probably the most...
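
One way to see where the `Memcpy` nodes come from is to let ONNX Runtime write out the graph it actually executes on the CUDA execution provider and count the inserted `Memcpy*` nodes; a minimal sketch follows (the model paths and provider list are assumptions).

```python
# Sketch: dump the ORT-optimized graph for the CUDA EP and count the
# MemcpyToHost/MemcpyFromHost nodes inserted around CPU-only subgraphs
# such as Loop bodies. Model paths are assumptions.
import onnx
import onnxruntime as ort

MODEL = "model.onnx"
OPTIMIZED = "model.opt.onnx"

so = ort.SessionOptions()
so.optimized_model_filepath = OPTIMIZED  # ask ORT to save the graph it will run
ort.InferenceSession(MODEL, so, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])

graph = onnx.load(OPTIMIZED).graph
memcpy = [n for n in graph.node if n.op_type.startswith("Memcpy")]
print(f"{len(memcpy)} Memcpy nodes:", [n.name for n in memcpy])
```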

bug

Is there a way to control (limit) the global GPU memory usage of the onnxruntime backend in Triton? The TensorFlow backend has the following CLI option: ``` --backend-config tensorflow,gpu-memory-fraction=X ``` I...
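
I don't know of a documented onnxruntime counterpart to that TensorFlow flag; the closest knob in ONNX Runtime itself is the CUDA execution provider's `gpu_mem_limit` arena cap, shown below in plain onnxruntime Python as an assumption about what one would want the backend to expose.

```python
# Sketch: ONNX Runtime's CUDA execution provider accepts a per-session
# arena cap (gpu_mem_limit, in bytes). This is plain onnxruntime, not a
# Triton --backend-config flag; the model path and limit are illustrative.
import onnxruntime as ort

providers = [
    ("CUDAExecutionProvider", {
        "gpu_mem_limit": 2 * 1024 ** 3,           # cap the arena at ~2 GiB
        "arena_extend_strategy": "kSameAsRequested",
    }),
    "CPUExecutionProvider",
]
sess = ort.InferenceSession("model.onnx", providers=providers)
```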

**Description** I exported the yolov7 detection model to ONNX using this code https://github.com/WongKinYiu/yolov7/blob/main/export.py and deployed it to Triton. It worked really well in the normal case, but when the model can't detect anything...
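
A minimal client-side sketch of the empty-result case is below, assuming the common yolov7 export tensor names (`images` in, `output` out), a 640x640 input, and a local Triton HTTP endpoint; those names, shapes, and the model name are assumptions, and the point is only that the detections tensor can legitimately come back with zero rows.

```python
# Sketch: query a yolov7 ONNX model on Triton and handle the
# "no detections" case where the output has zero rows.
# Tensor names, shape, and model name are assumptions.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

image = np.zeros((1, 3, 640, 640), dtype=np.float32)  # placeholder input
inp = httpclient.InferInput("images", list(image.shape), "FP32")
inp.set_data_from_numpy(image)

result = client.infer(model_name="yolov7", inputs=[inp])
detections = result.as_numpy("output")

if detections is None or detections.shape[0] == 0:
    print("no detections for this image")
else:
    print(f"{detections.shape[0]} detections")
```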

**Description** I've been trying various Hugging Face models on Triton using the ONNX Runtime backend. The models are first converted from Hugging Face to ONNX using one of the onnxruntime converters and then...
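
For reference, one illustrative export path (using `optimum` rather than whichever converter was used in the issue, so treat the whole snippet as an assumption) that produces a `model.onnx` laid out for a Triton model repository:

```python
# Sketch: export a Hugging Face model to ONNX with optimum and place it
# at model_repository/<name>/1/model.onnx for Triton. The model id and
# repository layout are illustrative assumptions.
from pathlib import Path
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
version_dir = Path("model_repository/distilbert/1")
version_dir.mkdir(parents=True, exist_ok=True)

model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
model.save_pretrained(version_dir)  # writes model.onnx (plus config) into the version dir
```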

Our ONNX models need onnxruntime version 1.10.0. I'm using Triton Server version 22.08, which ships onnxruntime version 1.11.1. How can I use the required version?

**Description** When attempting to launch a model converted to ONNX with `convert_sklearn`, the model fails to load with this error: ``` UNAVAILABLE: Internal: onnx runtime error 6: Exception during initialization:...
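
A common way to narrow down that kind of initialization failure is to check that plain onnxruntime can load and run the `convert_sklearn` output locally before handing it to Triton; a minimal sketch follows (the RandomForest estimator and feature count are illustrative assumptions).

```python
# Sketch: convert an sklearn model with convert_sklearn and verify that
# plain onnxruntime can load and run it before deploying to Triton.
# The estimator and feature count are illustrative assumptions.
import numpy as np
import onnxruntime as ort
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X = np.random.rand(100, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
clf = RandomForestClassifier(n_estimators=10).fit(X, y)

onx = convert_sklearn(clf, initial_types=[("input", FloatTensorType([None, 4]))])
with open("model.onnx", "wb") as f:
    f.write(onx.SerializeToString())

# If this load fails with "onnx runtime error 6", the problem is in the
# exported model itself rather than in the Triton backend.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
print(sess.run(None, {"input": X[:2]}))
```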

Hey all, I have a quick question: is onnxruntime-genai ([https://onnxruntime.ai/docs/genai/api/python.html](https://onnxruntime.ai/docs/genai/api/python.html)) supported in Triton Inference Server's ONNX Runtime backend? I couldn't find anything relevant in the documentation. Thanks in advance!