TensorRT ✨[Feature] Weight specific engine caching

Open narendasan opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

Engine caching is currently weight-agnostic, but this comes at the cost of lower-performance engines.

Describe the solution you'd like

If we know that the weights will be identical, we can cache higher-performance engines. The caching system would need to distinguish between these two kinds of cached engines and select the right one based on user settings.

TensorRT has a builder flag, kREFIT_IDENTICAL, for this workflow.
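To make the idea of two distinguishable caches concrete, here is a minimal sketch of how a cache key could separate weight-agnostic entries from weight-specific ones. Everything in it is hypothetical and for illustration only: `engine_cache_key`, `graph_signature`, and the `weight_specific` setting are made-up names, not part of the existing caching API, and a real implementation would hash the actual graph and tensor data instead of plain strings and bytes.

```python
import hashlib
from typing import Mapping


def engine_cache_key(
    graph_signature: str,
    weights: Mapping[str, bytes],
    weight_specific: bool,
) -> str:
    """Build a cache key for an engine (hypothetical helper, not a real API).

    weight_specific=False: the key depends only on the graph, so engines
    are shared across different weights (the current behaviour).
    weight_specific=True: the key also covers the weight contents, so a hit
    guarantees identical weights and the engine could have been built with
    the kREFIT_IDENTICAL flag.
    """
    h = hashlib.sha256()
    # Tag the mode so the two kinds of entries can never collide.
    h.update(b"specific:" if weight_specific else b"agnostic:")
    h.update(graph_signature.encode("utf-8"))
    if weight_specific:
        # Sort by name so the key does not depend on dict ordering.
        for name in sorted(weights):
            h.update(name.encode("utf-8"))
            h.update(hashlib.sha256(weights[name]).digest())
    return h.hexdigest()


# Two sets of weights for the same graph.
w_a = {"conv.weight": b"\x00\x01", "conv.bias": b"\x02"}
w_b = {"conv.weight": b"\xff\x01", "conv.bias": b"\x02"}

agnostic_a = engine_cache_key("graph-v1", w_a, weight_specific=False)
agnostic_b = engine_cache_key("graph-v1", w_b, weight_specific=False)
specific_a = engine_cache_key("graph-v1", w_a, weight_specific=True)
specific_b = engine_cache_key("graph-v1", w_b, weight_specific=True)
```

With this scheme, the weight-agnostic keys match for both sets of weights, while the weight-specific keys differ, and the user setting alone decides which kind of entry is looked up.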

Describe alternatives you've considered

Additional context

Sep 04 '24 17:09 narendasan
