TensorRT
Accuracy problem between ONNX and FP16 TensorRT inference
Description
I am encountering an accuracy discrepancy between ONNX Runtime inference and TensorRT FP32 inference.
Environment
TensorRT Version: 10.8.0.43
NVIDIA GPU: RTX 3060
NVIDIA Driver Version: 560.35.05
CUDA Version: 12.4.99
CUDNN Version: 9.8.0.87
Operating System: Ubuntu 24.04.1
Python Version (if applicable): 3.11.11
PyTorch Version (if applicable): 2.6.0
Relevant Files
Model link:
https://drive.google.com/drive/folders/1OaczMFXSv2a46QZHwSsn6lT8V3S_Ki-O?usp=sharing
Steps To Reproduce
polygraphy run --onnxrt tlr_202506151003.onnx \
--data-loader-script tool/data_loader.py \
--save-outputs outputs_fp32.json
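For context, Polygraphy's `--data-loader-script` expects the script to expose a `load_data()` generator that yields feed dicts mapping input names to NumPy arrays; using a fixed seed ensures both the ONNX Runtime and TensorRT runs see identical inputs. A minimal sketch of what `tool/data_loader.py` might look like (the input name `"input"` and the shape `1x3x224x224` are placeholders, not taken from the actual model):

```python
import numpy as np

def load_data():
    # Placeholder input binding name and shape -- replace with the model's
    # real input (check with `polygraphy inspect model tlr_202506151003.onnx`).
    rng = np.random.default_rng(seed=0)  # fixed seed: both runs get the same data
    for _ in range(4):
        yield {"input": rng.standard_normal((1, 3, 224, 224)).astype(np.float32)}
```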
trtexec --onnx=tlr_202506151003.onnx --saveEngine=model_fp32.plan
polygraphy run --trt model_fp32.plan \
--data-loader-script tool/data_loader.py \
--load-outputs outputs_fp32.json \
--atol 0.01 --rtol 0.01
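To clarify what the tolerances above mean: to my understanding, Polygraphy's default comparison flags an element as a mismatch only when it exceeds both the absolute and the relative tolerance, which is looser than `np.allclose` (where the two tolerances are combined as `atol + rtol * |ref|`). A rough sketch of that per-element check:

```python
import numpy as np

def mismatches(ref, out, atol=0.01, rtol=0.01):
    # Polygraphy-style check (as I understand it): an element fails only
    # when it exceeds BOTH tolerances, unlike np.allclose.
    absdiff = np.abs(out - ref)
    reldiff = absdiff / (np.abs(ref) + 1e-12)  # guard against division by zero
    return np.logical_and(absdiff > atol, reldiff > rtol)

ref = np.array([1.0, 0.5, -2.0], dtype=np.float32)
out = np.array([1.005, 0.5001, -2.5], dtype=np.float32)
print(mismatches(ref, out))  # only the last element exceeds both tolerances
```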
Have you tried the latest release?: Not yet.