
Accuracy problem between ONNX and FP16 TRT inference

Open KexianShen opened this issue 10 months ago • 3 comments

Description

I am encountering an accuracy discrepancy between ONNX inference and TensorRT FP32 inference.

Environment

TensorRT Version: 10.8.0.43

NVIDIA GPU: RTX 3060

NVIDIA Driver Version: 560.35.05

CUDA Version: 12.4.99

CUDNN Version: 9.8.0.87

Operating System: Ubuntu 24.04.1

Python Version (if applicable): 3.11.11

PyTorch Version (if applicable): 2.6.0

Relevant Files

Model link:

https://drive.google.com/drive/folders/1OaczMFXSv2a46QZHwSsn6lT8V3S_Ki-O?usp=sharing

Steps To Reproduce

polygraphy run --onnxrt tlr_202506151003.onnx \
    --data-loader-script tool/data_loader.py \
    --save-outputs outputs_fp32.json

trtexec --onnx=tlr_202506151003.onnx --saveEngine=model_fp32.plan

polygraphy run --trt model_fp32.plan \
    --data-loader-script tool/data_loader.py \
    --load-outputs outputs_fp32.json \
    --atol 0.01 --rtol 0.01
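
The `tool/data_loader.py` script is not included in the issue; for reference, Polygraphy's `--data-loader-script` option expects the script to define a generator (named `load_data` by default) that yields one feed dict per comparison iteration. A minimal sketch of such a script, where the input name, shape, and iteration count are placeholders that would need to match the actual model:

```python
import numpy as np

# Hypothetical tool/data_loader.py for Polygraphy's --data-loader-script.
# The input name "input" and shape (1, 3, 224, 224) are assumptions --
# they must match the ONNX model's real input signature.
def load_data():
    rng = np.random.default_rng(seed=0)  # fixed seed so ONNX-RT and TRT runs see identical data
    for _ in range(4):  # one feed dict per comparison iteration
        yield {"input": rng.standard_normal((1, 3, 224, 224)).astype(np.float32)}
```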

Have you tried the latest release?: not yet

KexianShen avatar Jun 18 '25 03:06 KexianShen

I'm not able to reproduce a significant difference at least on newer versions of TRT. Could you try disabling TF32 precision when building (add --noTF32 to your trtexec command) and see if the problem persists?

pranavm-nvidia avatar Jun 18 '25 20:06 pranavm-nvidia
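
For context: on Ampere GPUs like the RTX 3060, TensorRT may run FP32 matmuls/convolutions in TF32 by default, which keeps FP32's exponent range but only 10 explicit mantissa bits instead of 23. A rough numpy sketch of the resulting precision loss (approximating TF32 by zeroing the low 13 mantissa bits; real Tensor Cores round rather than truncate, so this is only an illustration):

```python
import numpy as np

def to_tf32(x):
    # TF32 keeps 10 explicit mantissa bits vs FP32's 23, so zeroing the
    # low 13 bits of the FP32 encoding approximates the precision loss.
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32)

a = np.float32(1.0) + np.float32(2.0 ** -20)  # representable in FP32
print(a == to_tf32(a))  # False: the low mantissa bits are discarded
```

This is why a comparison against full-FP32 ONNX-RT output can exceed tight tolerances like 0.01 even though the engine was "built in FP32", and why `--noTF32` removes the discrepancy.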

yes, --noTF32 works, thank you

KexianShen avatar Jun 19 '25 02:06 KexianShen

The precision difference between ONNX and FP16 TensorRT (TRT) outputs exceeds tolerance at the layer /model/encoder/rope_self_attn_layers/attention_norm/Pow, so I ran:

polygraphy run tlr_202506151003.onnx \
    --onnxrt \
    --trt --fp16 \
    --onnx-outputs mark all \
    --trt-outputs mark all \
    --atol 1e-1 \
    --rtol 1e-1 \
    --layer-precisions /model/encoder/rope_self_attn_layers/attention_norm/Pow:float32 \
    --precision-constraints obey \
    --fail-fast

but forcing that layer to float32 this way doesn't resolve the mismatch.

KexianShen avatar Jun 19 '25 09:06 KexianShen

Could you check whether it's a limitation with the model itself? You could try converting the ONNX model to FP16 (example) and running it with ONNX-RT to see if you have the same discrepancy.

pranavm-nvidia avatar Jun 24 '25 17:06 pranavm-nvidia

I converted the model to FP16 and the same discrepancy persists.

[E]         FAILED | Output: 'query' | Difference exceeds tolerance (rel=0.001, abs=0.001)
[E]     FAILED | Mismatched outputs: ['box', 'cls', 'color', 'orientation', 'related', 'query']
[E] Accuracy Summary | onnxrt-runner-N0-06/26/25-10:37:38 vs. onnxrt-runner-N0-06/26/25-10:37:18 | Passed: 0/1 iterations | Pass Rate: 0.0%

KexianShen avatar Jun 26 '25 02:06 KexianShen

Seems likely that the model itself is not FP16-friendly then. You can try to convert it in a smarter way using something like ModelOpt's AutoCast tool.

pranavm-nvidia avatar Jun 26 '25 17:06 pranavm-nvidia
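
As an illustration of why this particular layer can be FP16-unfriendly: the name `attention_norm/Pow` suggests an RMS-style normalization (an assumption; the actual graph is not shown here), and the squaring step of such a norm overflows FP16's maximum representable value (65504) for moderately large activations. A numpy sketch under that assumption:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # RMS-style normalization: x / sqrt(mean(x**2) + eps).
    # The squaring (the ONNX `Pow` node) is the fragile step in FP16.
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

x = np.full((1, 8), 300.0, dtype=np.float32)
print(rms_norm(x))  # finite: each element normalizes to ~1.0

# 300**2 = 90000 exceeds the FP16 max (65504), so x**2 overflows to inf
with np.errstate(over="ignore"):
    print(rms_norm(x.astype(np.float16)))  # the norm collapses to zeros
```

This kind of overflow reproduces in pure ONNX-RT FP16 as well, which matches the observation above that the discrepancy is a property of the model, not of TensorRT.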

Hi, I also have accuracy issues when I convert the model to FP16 using TensorRT.

geraldstanje avatar Jun 27 '25 16:06 geraldstanje