onnxruntime_backend
Allow specifying TensorRT cache path per model version
When using ONNX Runtime with TensorRT, enabling the TensorRT engine cache saves a lot of time. The drawback is that ONNX Runtime is not smart enough to avoid reusing the same cache when the model has changed or the TensorRT version has changed, which causes a lot of errors.
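For context, this is roughly how the engine cache is turned on through ONNX Runtime's TensorRT execution provider options; the cache directory and model file name here are placeholders:

```python
import onnxruntime as ort

# Enable the TensorRT engine cache so compiled plans are reused across
# restarts instead of being rebuilt from scratch.
providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": "true",
        # Every session pointed at this directory shares one cache; nothing
        # in the path distinguishes model versions, so a stale plan can be
        # picked up after the model changes.
        "trt_engine_cache_path": "/opt/trt_cache",  # placeholder path
    }),
    "CUDAExecutionProvider",
]
session = ort.InferenceSession("model.onnx", providers=providers)
```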
It would be great if it could generate a TensorRT cache path per model version; that would at least prevent wrong outputs when the model version changes. If the path also included the GPU model and TensorRT version, that would solve the other case as well, but I think that is less of a problem, since it is acceptable to clear the cache when deploying new versions.
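A minimal sketch of what such a per-version layout could look like, assuming the cache path is computed before the session is created; `versioned_cache_path`, the model name, and the version string are all hypothetical:

```python
import os
import onnxruntime as ort

def versioned_cache_path(base: str, model_name: str, model_version: str) -> str:
    # Hypothetical layout: one cache directory per model version, e.g.
    # /opt/trt_cache/resnet50/3/. The same idea could be extended with the
    # GPU model and TensorRT version (e.g. .../3/A100_trt-8.6/) so a
    # hardware or library change can never pick up a stale plan.
    path = os.path.join(base, model_name, model_version)
    os.makedirs(path, exist_ok=True)
    return path

providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": "true",
        "trt_engine_cache_path": versioned_cache_path("/opt/trt_cache", "resnet50", "3"),
    }),
    "CUDAExecutionProvider",
]
session = ort.InferenceSession("/models/resnet50/3/model.onnx", providers=providers)
```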
The warmup feature solves all of these issues, but it comes at the cost of a very slow startup: some models can take minutes to generate the TensorRT plan.
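For comparison, warmup effectively amounts to running a dummy inference at startup so the TensorRT engine is built before the first real request; the input name and shape below are placeholders:

```python
import numpy as np
import onnxruntime as ort

providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": "true",
        "trt_engine_cache_path": "/opt/trt_cache",  # placeholder path
    }),
]
session = ort.InferenceSession("model.onnx", providers=providers)

# The first inference triggers the TensorRT engine build; this is the step
# that makes warmup-based startup slow. With a valid cache it is nearly
# instant, which is why a correct per-version cache path matters.
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder shape
session.run(None, {"input": dummy})                   # placeholder input name
```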