
[TensorRT EP] Support engine hardware compatibility

yf711 opened this issue on May 13, 2024

Description

  • Introduce the option trt_engine_hw_compatible to support engine hardware compatibility for Ampere+ GPUs
    • This enables the nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS flag when generating engines
    • This option has been validated on sm80/sm86 GPUs, as an engine can be reused across different Ampere+ architectures (see the sketch after this list):
      • The client side needs to enable this option as well to leverage existing sm80+ engines
    • If a user enables this option with TensorRT < 8.6 or a GPU with sm < 80, a warning is shown that the option is not supported
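
A rough sketch of how this could be used from Python, assuming trt_engine_hw_compatible is passed as a TensorRT EP provider option alongside the existing engine-cache options; the model path and cache directory below are placeholders:

```python
# Minimal sketch (assumption: trt_engine_hw_compatible is exposed like the
# other trt_* provider options; "model.onnx" and "./trt_cache" are placeholders).
import onnxruntime as ort

trt_options = {
    "trt_engine_cache_enable": True,         # persist built engines to disk
    "trt_engine_cache_path": "./trt_cache",  # placeholder cache directory
    "trt_engine_hw_compatible": True,        # build Ampere+ hardware-compatible engines
}

session = ort.InferenceSession(
    "model.onnx",  # placeholder model
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
```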

Engine naming:

| GPU | trt_engine_hw_compat=false | trt_engine_hw_compat=true |
| --- | --- | --- |
| A100 (sm80) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
| RTX3080 (sm86) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm86.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
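
A hedged sketch of the client side implied by the table: on a different Ampere+ GPU (e.g. the RTX 3080), pointing at the same cache directory with the option enabled should let the session pick up the existing _sm80+.engine instead of rebuilding it. The cache path and model name are again placeholders:

```python
# Client-side sketch (assumption: the cache directory built on the sm80 machine
# has been copied or shared; paths and model name are placeholders).
import onnxruntime as ort

client_trt_options = {
    "trt_engine_cache_enable": True,
    "trt_engine_cache_path": "./trt_cache",  # same cache produced on the sm80 build machine
    "trt_engine_hw_compatible": True,        # must also be enabled to reuse the sm80+ engine
}

session = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("TensorrtExecutionProvider", client_trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
```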

Motivation and Context

Reference: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#hardware-compat
