Incorrect output from TensorRT 10.0.1.6 when running a conv+clip structure on an NVIDIA L4 GPU
Description
My model produced incorrect results when accelerated with TensorRT. Debugging narrowed the problem down to a miscomputation of the conv+clip graph structure. I created a small model containing only a single Conv and a single Clip operator that reproduces the problem.
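Since the reproducer ONNX file is not attached inline here, the sketch below is only an illustration of the conv+clip structure being tested: a NumPy reference implementation that TRT/ONNX Runtime outputs could be compared against. The weights, bias, and clip bounds are made-up placeholders, not the real model's values.

```python
import numpy as np

def conv2d_clip(x, w, b, clip_min=0.0, clip_max=6.0):
    """Reference conv+clip: stride 1, no padding.
    x: (n, cin, h, w) input, w: (cout, cin, kh, kw) weights, b: (cout,) bias.
    clip_min/clip_max are placeholders; the real bounds come from the ONNX Clip node."""
    n, cin, h, win = x.shape
    cout, _, kh, kw = w.shape
    oh, ow = h - kh + 1, win - kw + 1
    out = np.zeros((n, cout, oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            patch = x[:, :, i:i + kh, j:j + kw]  # (n, cin, kh, kw)
            # Contract over cin/kh/kw to get a (n, cout) result per position.
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3])) + b
    return np.clip(out, clip_min, clip_max)

# Toy data standing in for the real model's tensors.
x = np.random.rand(1, 3, 8, 8).astype(np.float32) - 0.5
w = np.random.rand(4, 3, 3, 3).astype(np.float32)
b = np.zeros(4, dtype=np.float32)
y = conv2d_clip(x, w, b)
```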
Environment
**TensorRT Version**: 10.0.1.6
**NVIDIA GPU**: NVIDIA L4
**NVIDIA Driver Version**: 535.129.03
**CUDA Version**: 11.8
**CUDNN Version**: libcudnn.so.8.9.6
**Operating System**:
**Python Version (if applicable)**: 3.10.13
**PyTorch Version (if applicable)**: 2.1.2
Relevant Files
Steps To Reproduce
- Compile the ONNX model to a TRT engine
- Load an image, convert it to a (1, 3, 512, 512) tensor, and normalize it to (-0.5, 0.5) as model input
- Run inference in TRT
- Add 0.5 to the TRT output to normalize it back to (0, 1)
- Save the output image
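The pre- and post-processing in the steps above can be sketched as follows. A random uint8 array stands in for the loaded image, and a placeholder stands in for the TRT engine call; the divide-by-255-then-subtract-0.5 normalization is an assumption based on the (-0.5, 0.5) range described.

```python
import numpy as np

# Step 2: image -> (1, 3, 512, 512) tensor normalized to (-0.5, 0.5).
# A random uint8 array stands in for the actual loaded image.
img = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
x = img.astype(np.float32) / 255.0 - 0.5   # HWC, values in (-0.5, 0.5)
x = x.transpose(2, 0, 1)[np.newaxis]       # NCHW: (1, 3, 512, 512)

# Step 3: the TRT engine would run here; identity used as a placeholder.
y = x

# Step 4: shift back to (0, 1) and convert to an 8-bit image for saving.
out = np.clip(y + 0.5, 0.0, 1.0)
img_out = (out[0].transpose(1, 2, 0) * 255.0).astype(np.uint8)
```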
Please use polygraphy to compare fp32/fp16 against onnxruntime.
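For example, assuming the reproducer model is saved as `model.onnx` (the path is a placeholder), a comparison could look like:

```shell
# Compare TensorRT FP32 output against ONNX Runtime
polygraphy run model.onnx --trt --onnxrt

# Same comparison with FP16 enabled in TensorRT
polygraphy run model.onnx --trt --fp16 --onnxrt
```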
@yikox have you tried disabling TF32?
export NVIDIA_TF32_OVERRIDE=0
https://deeprec.readthedocs.io/en/latest/NVIDIA-TF32.html
thanks!
@yikox, I will be closing this ticket per our policy of closing tickets with no activity for more than 21 days after a reply has been posted. Please open a new ticket if you still need help.