onnxruntime
[TensorRT EP] Support engine hardware compatibility
Description
- Introduces the option `trt_engine_hw_compatible` to support engine hardware compatibility for Ampere+ GPUs.
- When enabled, the `nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS` flag is set when generating engines.
- This option has been validated on sm80/sm86 GPUs, as an engine can be reused across different Ampere+ architectures:
  - The client side needs to enable this option as well to leverage existing sm80+ engines.
  - If a user enables this option with TRT < 8.6 or sm < 80, a warning is shown that the option is not supported.
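As a usage sketch, the option can be passed through TensorRT EP provider options in the Python API (paths and model name below are placeholders; assumes onnxruntime-gpu built against TensorRT >= 8.6 on an Ampere+ GPU):

```python
# Enable hardware-compatible engine generation together with engine caching,
# so the cached sm80+ engine can be reused on other Ampere+ GPUs.
provider_options = {
    "trt_engine_cache_enable": "True",      # cache generated engines on disk
    "trt_engine_cache_path": "./trt_cache", # placeholder cache directory
    "trt_engine_hw_compatible": "True",     # option introduced by this PR
}
providers = [("TensorrtExecutionProvider", provider_options)]

# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
```

On a machine without TRT >= 8.6 or an sm80+ GPU, the EP logs a warning and ignores the option rather than failing session creation.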
Engine naming:

| GPU | trt_engine_hw_compat=false | trt_engine_hw_compat=true |
|---|---|---|
| A100 (sm80) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
| RTX3080 (sm86) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm86.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
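The naming behavior in the table can be summarized as: compatible engines collapse the per-architecture suffix into a shared `sm80+` suffix. A hypothetical sketch of that suffix logic (the function name is illustrative, not the actual implementation):

```python
def engine_suffix(sm: int, hw_compatible: bool) -> str:
    """Illustrative: hardware-compatible engines share one 'sm80+' suffix,
    so a cache entry built on sm80 also matches lookups on sm86."""
    return "sm80+" if hw_compatible else f"sm{sm}"
```

This is why both A100 and RTX3080 resolve to the same cached engine file when the option is on.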
Motivation and Context
Reference: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#hardware-compat