TensorRT
RT-DETR FP16 inference gets correct results on V100 but weird results on A10
Description
I am using tritonserver:23.10 to deploy an RT-DETR model. ONNX Runtime FP32, ONNX Runtime FP16, and Tesla V100 TRT 8.6.1 FP16/FP32 all produce the correct result. But on a Tesla A10, TRT 8.6.1 produces the correct result in FP32 and a weird result in FP16. With the same code, the FP16 results should be the same on V100 and A10.
A10 - TRT - FP16
V100 - TRT - FP16
Environment
TensorRT Version: TensorRT 8.6.1
NVIDIA GPU: Tesla V100 / A10
NVIDIA Driver Version: 515.65.01
CUDA Version: 12.2
CUDNN Version: v8
Operating System: Ubuntu 22.04
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 2.0.1
Baremetal or Container (if so, version): tritonserver 23.10
Relevant Files
https://github.com/lyuwenyu/RT-DETR
Steps To Reproduce
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan --fp16
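One way to quantify "weird" FP16 results before filing them as a bug is to dump outputs from the FP32 and FP16 engines on the same input and compare them numerically. The snippet below is a minimal sketch of such a comparison (not from this thread); it assumes you have already saved both output tensors, and it simulates the FP16 run here by rounding the reference to half precision:

```python
import numpy as np

def compare_outputs(ref, test, rtol=1e-2, atol=1e-2):
    """Quantify the mismatch between an FP32 reference and an FP16 run."""
    ref = np.asarray(ref, dtype=np.float32).ravel()
    test = np.asarray(test, dtype=np.float32).ravel()
    # Largest element-wise deviation between the two runs.
    max_abs = float(np.max(np.abs(ref - test)))
    # Cosine similarity: close to 1.0 means the outputs broadly agree.
    cos = float(np.dot(ref, test) /
                (np.linalg.norm(ref) * np.linalg.norm(test) + 1e-12))
    ok = bool(np.allclose(ref, test, rtol=rtol, atol=atol))
    return max_abs, cos, ok

# Stand-in for a dumped FP32 output; replace with the real engine outputs.
ref = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
fp16_run = ref.astype(np.float16).astype(np.float32)

max_abs, cos, ok = compare_outputs(ref, fp16_run)
print(max_abs, cos, ok)
```

A healthy FP16 build shows a small max deviation and cosine similarity near 1.0; a genuinely broken build (like the A10 result reported here) collapses both numbers, which is much stronger evidence than eyeballing detections.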
- Did you test metric like mAP?
- Could you please share the ONNX model here for reproduction?
Thanks!
FP16 may introduce an accuracy drop, so it's hard to say whether it's a bug unless we have a generic metric like mAP.
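Since the maintainer asks for a metric rather than screenshots, a quick proxy short of a full mAP run is to match detections from the FP32 and FP16 engines and check their IoU. This is a hedged sketch with made-up boxes (the real values would come from the two engines):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

# Hypothetical matched detections from the FP32 and FP16 engines on one image.
fp32_boxes = [[10, 10, 110, 110], [200, 50, 300, 150]]
fp16_boxes = [[11, 9, 111, 109], [205, 55, 305, 155]]

# A mild precision drop keeps matched IoUs near 1.0; a "weird" FP16 result
# like the one reported shows up as IoUs collapsing toward 0.
ious = [iou(a, b) for a, b in zip(fp32_boxes, fp16_boxes)]
print(ious)
```

Averaging such IoUs (or running the model zoo's mAP evaluation) gives the generic number the maintainer is asking for, and makes the V100-vs-A10 difference directly comparable.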
closing since no activity for more than 3 weeks, thanks all!
https://github.com/NVIDIA/TensorRT/issues/3700
Please reopen this issue to track the Ampere accuracy loss issue on all DETR-like models.