onnxruntime_backend
The Triton backend for the ONNX Runtime.
Use the special ORT branch 'tensorrt-8.5ea', brought in with the ORT 1.12.1 release, to make use of the built-in TensorRT parser.
For every instance in a model instance group, a new ORT session is created. This change adds support for sharing a single session across the instances of an instance group. This support can be enabled...
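A minimal model configuration sketch of how such an option might be enabled, assuming it is exposed as a model parameter; the parameter name `share_session` is illustrative and may differ in the actual change:

```
# config.pbtxt (sketch; the "share_session" parameter name is illustrative)
instance_group [
  {
    count: 4
    kind: KIND_GPU
  }
]
parameters {
  key: "share_session"
  value: { string_value: "true" }
}
```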
**Description** Hi all, I recently ran into an issue when deploying an ONNX model: it **consumes too much memory**. Please see the linked [issue](https://github.com/microsoft/onnxruntime/issues/1725); it seems to be a feature of...
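Memory growth like this can often be reproduced with standalone onnxruntime. A minimal Python sketch that disables the CPU memory arena (the ORT behavior the linked issue points at) to compare memory usage against the default; `model.onnx` and the input name/shape are placeholders:

```python
import numpy as np
import onnxruntime as ort

# Disable the CPU memory arena to compare memory usage against the default.
so = ort.SessionOptions()
so.enable_cpu_mem_arena = False

session = ort.InferenceSession("model.onnx", sess_options=so,
                               providers=["CPUExecutionProvider"])

# Placeholder input; adjust the name, shape, and dtype to the actual model.
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
outputs = session.run(None, {"input": dummy})
```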
I am trying to use MMpose with the NVIDIA Triton server, but Triton does not support plain PyTorch models; it supports TorchScript, ONNX, and a few other formats. So, I have...
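A minimal sketch of exporting a PyTorch model to ONNX so it can be served by this backend; the model, input shape, and tensor names are placeholders (MMpose also ships its own deployment/export tooling):

```python
import torch
import torchvision

# Placeholder model; substitute the actual model you want to serve.
model = torchvision.models.resnet18(weights=None).eval()

dummy_input = torch.zeros(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=13,
)
```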
Background: My ONNX model includes `Dropout` ops that are executed in `training` mode. However, onnxruntime optimizes away `Dropout` ops by default, so I call `session = ort.InferenceSession(modelPath, disabled_optimizers=["EliminateDropout"])` to avoid that. Question: What...
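A sketch of two ways to keep the `Dropout` nodes: the `disabled_optimizers` argument from the question, or, more coarsely, lowering the graph optimization level; `model.onnx` is a placeholder:

```python
import onnxruntime as ort

# Option 1: disable only the dropout-elimination pass (as in the question).
session = ort.InferenceSession("model.onnx",
                               disabled_optimizers=["EliminateDropout"])

# Option 2 (coarser): turn off graph optimizations entirely, so no rewrite
# passes, including dropout elimination, are applied.
so = ort.SessionOptions()
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
session = ort.InferenceSession("model.onnx", sess_options=so)
```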
**Is your feature request related to a problem? Please describe.** I would like to use the Intel oneDNN Execution Provider (EP) in the ONNX Runtime build used by the Triton Inference Server ONNX...
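Outside of Triton, the oneDNN EP is selected in standalone onnxruntime roughly as below; it requires an ORT build with oneDNN enabled, and `model.onnx` is a placeholder:

```python
import onnxruntime as ort

# The oneDNN EP shows up as "DnnlExecutionProvider" when ORT was built with it.
print(ort.get_available_providers())

session = ort.InferenceSession(
    "model.onnx",
    providers=["DnnlExecutionProvider", "CPUExecutionProvider"],
)
```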
**Description** Using the same model as in #102, the Triton Inference Server has a memory leak, as observed with `docker stats`, after adding: ``` execution_accelerators { cpu_execution_accelerator : [ {...
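For reference, the complete form of that block as documented for this backend looks roughly like the sketch below; verify it against the README for your Triton release:

```
optimization {
  execution_accelerators {
    cpu_execution_accelerator : [
      {
        name : "openvino"
      }
    ]
  }
}
```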
**Description** I was unable to build the onnxruntime_backend with OpenVINO for Triton Inference Server r22.03 using the compatible ONNX Runtime and TensorRT versions (from the Triton Inference Server compatibility matrix). **Triton Information** r22.03...
When using ONNX with TensorRT, the TensorRT engine cache path saves a lot of time. The drawback is that onnxruntime is not smart enough to avoid using the...
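A configuration sketch of enabling the TensorRT engine cache through the ONNX Runtime TensorRT EP options, assuming the backend exposes the `trt_engine_cache_enable` and `trt_engine_cache_path` parameters (check the backend README for the exact keys supported by your version):

```
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "trt_engine_cache_enable" value: "true" }
        parameters { key: "trt_engine_cache_path" value: "/tmp/trt_cache" }
      }
    ]
  }
}
```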