Incorrect output from TensorRT 10.0.1.6 when running a conv+clip structure on an NVIDIA L4 GPU
Description
My model produced incorrect results when accelerated with TensorRT. Debugging narrowed the problem down to a miscomputation of the conv+clip graph structure. I created a small model containing only a single Conv and a single Clip operator that reproduces the problem.
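Since the reproducer ONNX file is not attached inline here, the sketch below is only an illustration of the conv+clip structure being tested: a NumPy reference implementation that TRT/ONNX Runtime outputs could be compared against. The weights, bias, and clip bounds are made-up placeholders, not the real model's values.

```python
import numpy as np

def conv2d_clip(x, w, b, clip_min=0.0, clip_max=6.0):
    """Reference conv+clip: stride 1, no padding.
    x: (n, cin, h, w) input, w: (cout, cin, kh, kw) weights, b: (cout,) bias.
    clip_min/clip_max are placeholders; the real bounds come from the ONNX Clip node."""
    n, cin, h, win = x.shape
    cout, _, kh, kw = w.shape
    oh, ow = h - kh + 1, win - kw + 1
    out = np.zeros((n, cout, oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            patch = x[:, :, i:i + kh, j:j + kw]  # (n, cin, kh, kw)
            # Contract over cin/kh/kw to get a (n, cout) result per position.
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3])) + b
    return np.clip(out, clip_min, clip_max)

# Toy data standing in for the real model's tensors.
x = np.random.rand(1, 3, 8, 8).astype(np.float32) - 0.5
w = np.random.rand(4, 3, 3, 3).astype(np.float32)
b = np.zeros(4, dtype=np.float32)
y = conv2d_clip(x, w, b)
```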
Environment
**TensorRT Version**: 10.0.1.6
**NVIDIA GPU**: NVIDIA L4
**NVIDIA Driver Version**: 535.129.03
**CUDA Version**: 11.8
**CUDNN Version**: libcudnn.so.8.9.6
**Operating System**:
**Python Version (if applicable)**: 3.10.13
**PyTorch Version (if applicable)**: 2.1.2
Relevant Files
Steps To Reproduce
- Compile the ONNX model to a TRT engine
- Load an image, convert it to a (1, 3, 512, 512) tensor, and normalize it to (-0.5, 0.5) as model input
- Run inference in TRT
- Add 0.5 to the TRT output to normalize it back to (0, 1)
- Save the output image
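The pre- and post-processing in the steps above can be sketched as follows. A random uint8 array stands in for the loaded image, and a placeholder stands in for the TRT engine call; the divide-by-255-then-subtract-0.5 normalization is an assumption based on the (-0.5, 0.5) range described.

```python
import numpy as np

# Step 2: image -> (1, 3, 512, 512) tensor normalized to (-0.5, 0.5).
# A random uint8 array stands in for the actual loaded image.
img = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
x = img.astype(np.float32) / 255.0 - 0.5   # HWC, values in (-0.5, 0.5)
x = x.transpose(2, 0, 1)[np.newaxis]       # NCHW: (1, 3, 512, 512)

# Step 3: the TRT engine would run here; identity used as a placeholder.
y = x

# Step 4: shift back to (0, 1) and convert to an 8-bit image for saving.
out = np.clip(y + 0.5, 0.0, 1.0)
img_out = (out[0].transpose(1, 2, 0) * 255.0).astype(np.uint8)
```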
Please use polygraphy to compare fp32/fp16 against onnxruntime.
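For example, assuming the reproducer model is saved as `model.onnx` (the path is a placeholder), a comparison could look like:

```shell
# Compare TensorRT FP32 output against ONNX Runtime
polygraphy run model.onnx --trt --onnxrt

# Same comparison with FP16 enabled in TensorRT
polygraphy run model.onnx --trt --fp16 --onnxrt
```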
@yikox have you tried disabling TF32?
export NVIDIA_TF32_OVERRIDE=0
https://deeprec.readthedocs.io/en/latest/NVIDIA-TF32.html
thanks!
@yikox, I will be closing this ticket per our policy of closing tickets with no activity for more than 21 days after a reply has been posted. Please open a new ticket if you still need help.