
RT-DETR FP16 inference gets correct results on V100 but weird results on A10

Open xiaochus opened this issue 1 year ago • 2 comments

Description

I am using tritonserver:23.10 to deploy an RT-DETR model. ONNX Runtime FP32, ONNX Runtime FP16, and Tesla V100 TRT 8.6.1 FP32/FP16 all produce the correct result. But on a Tesla A10, TRT 8.6.1 produces the correct result in FP32 and a weird result in FP16. With the same code, the FP16 result should be the same on the V100 and the A10.

A10 - TRT - FP16: (result screenshot)

V100 - TRT - FP16: (result screenshot)

Environment

TensorRT Version: TensorRT 8.6.1

NVIDIA GPU: Tesla V100 / A10

NVIDIA Driver Version: 515.65.01

CUDA Version: 12.2

CUDNN Version: v8

Operating System: Ubuntu 22.04

Python Version (if applicable): 3.10

Tensorflow Version (if applicable):

PyTorch Version (if applicable): 2.0.1

Baremetal or Container (if so, version): tritonserver 23.10

Relevant Files

https://github.com/lyuwenyu/RT-DETR

Steps To Reproduce

/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan --fp16
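When FP16 accuracy differs between GPUs, a common debugging step is to rebuild with precision constraints so that layers suspected of overflowing (often LayerNorm/Softmax in DETR-style models) stay in FP32. A sketch of that build, assuming the same trtexec binary; the layer name below is a hypothetical placeholder, so substitute real names from your model's graph:

```shell
# Sketch: build mostly-FP16 engine but pin a suspect layer to FP32.
# "LayerNorm_123" is a placeholder; list actual layer names from your ONNX graph.
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan \
    --fp16 \
    --precisionConstraints=obey \
    --layerPrecisions="LayerNorm_123":fp32
```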

xiaochus avatar Feb 05 '24 09:02 xiaochus

  1. Did you test a metric like mAP?
  2. Could you please share the ONNX here so we can reproduce?

Thanks!

zerollzeng avatar Feb 07 '24 09:02 zerollzeng

FP16 may introduce an accuracy drop, so it's hard to say whether it's a bug unless we have a generic metric like mAP.
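One way to check whether the drop is ordinary FP16 rounding noise or a genuine overflow in some layer is to diff the FP32 and FP16 outputs directly. A minimal numpy sketch; the arrays here are synthetic stand-ins for the real engine outputs:

```python
import numpy as np

# Stand-ins for real engine outputs: FP32 reference vs. FP16 round-trip.
rng = np.random.default_rng(0)
fp32_out = rng.standard_normal((1, 300, 4)).astype(np.float32)
fp16_out = fp32_out.astype(np.float16).astype(np.float32)  # simulate FP16 rounding

abs_diff = np.abs(fp32_out - fp16_out)
print("max abs diff: ", abs_diff.max())
print("mean abs diff:", abs_diff.mean())
```

Plain FP16 rounding gives a small, uniform error (roughly 1e-3 relative); a handful of elements that are wildly off instead suggests an overflowing intermediate layer, which is a bug worth tracking rather than expected precision loss.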

zerollzeng avatar Feb 07 '24 09:02 zerollzeng

Closing since there has been no activity for more than 3 weeks, thanks all!

ttyio avatar Mar 05 '24 16:03 ttyio

https://github.com/NVIDIA/TensorRT/issues/3700

chinakook avatar Mar 08 '24 14:03 chinakook

Please reopen this issue to track the Ampere accuracy-loss issue on all DETR-like models.

chinakook avatar Apr 19 '24 10:04 chinakook