
Add support for loading ONNX files with the TensorRT backend


**Describe the solution you'd like**
Be able to use an ONNX file directly with the TensorRT backend, given that TensorRT has an ONNX parser. The backend would build the engine on warmup and cache it.
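For illustration, a minimal sketch of the kind of conversion being requested, using TensorRT's Python ONNX parser; the file paths and the `build_or_load_engine` helper are hypothetical, and the request is for the TensorRT backend to do this automatically on warmup rather than requiring it offline:

```python
import os
import tensorrt as trt

def build_or_load_engine(onnx_path: str, engine_path: str) -> bytes:
    """Build a serialized TensorRT engine from an ONNX file, reusing a cached copy if present."""
    # Reuse a previously built engine if it is already cached on disk.
    if os.path.exists(engine_path):
        with open(engine_path, "rb") as f:
            return f.read()

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    # Parse the ONNX model into a TensorRT network definition.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = "\n".join(str(parser.get_error(i)) for i in range(parser.num_errors))
            raise RuntimeError(f"Failed to parse {onnx_path}:\n{errors}")

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace

    # Build, cache to disk, and return the serialized engine.
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)
    return bytes(serialized)
```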

**Describe alternatives you've considered**
Using the onnxruntime backend instead, but it has problems: https://github.com/triton-inference-server/server/issues/4587 and https://github.com/microsoft/onnxruntime/issues/11356

fran6co · Jul 06 '22 09:07

I think what you want is already implemented: https://github.com/triton-inference-server/server/blob/main/docs/optimization.md#onnx-with-tensorrt-optimization-ort-trt
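That doc describes the ORT-TRT path: the model still runs under the onnxruntime backend, with TensorRT enabled as a GPU execution accelerator in the model's `config.pbtxt`, roughly like the following (the parameter values here are illustrative):

```
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [ {
      name : "tensorrt"
      parameters { key: "precision_mode" value: "FP16" }
      parameters { key: "max_workspace_size_bytes" value: "1073741824" }
    } ]
  }
}
```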

joihn · Jul 06 '22 14:07

@joihn ORT is short for onnxruntime. What I'm asking for is to use TensorRT with ONNX files, but without onnxruntime.

fran6co · Jul 06 '22 14:07

+1

yaysummeriscoming · Aug 26 '22 16:08

@tanmayv25 thoughts?

jbkyang-nvi · Nov 22 '22 03:11

I would not like to complicate the TensorRT backend by making it consume ONNX files and own the conversion. ORT already supports the TRT execution provider.

tanmayv25 · Dec 14 '22 15:12

Closing this issue due to lack of activity. If this issue needs follow-up, please let us know and we can reopen it for you.

jbkyang-nvi · Jan 28 '23 00:01