
Accuracy drops after pytorch_quantization QAT

svitlana1ana opened this issue 1 year ago

Description

After QAT with pytorch_quantization, the model's accuracy drops relative to the original (pre-QAT) model.

Environment

TensorRT Version: 8.5.3.1

NVIDIA GPU: TITAN Xp

NVIDIA Driver Version: 450.80.02

CUDA Version: 11.0

CUDNN Version: 8.6.0

Operating System: Ubuntu 18.04.5 LTS

Python Version (if applicable):3.9.16

Tensorflow Version (if applicable):

PyTorch Version (if applicable): 1.12.1+cu102

Baremetal or Container (if so, version):

Relevant Files

Model link: code.zip

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

svitlana1ana · Aug 21 '24

You have only done calibration (PTQ). If calibration alone does not meet your accuracy expectations, you need to fine-tune the model (QAT).

lix19937 · Aug 24 '24

Calibration does not meet my expectations, so I am doing QAT training. My QAT code is in the attached file: qat code.zip

Am I writing this correctly? Or should I calibrate first and save the .pth, then load that .pth and run QAT training — i.e., split the operation into two steps instead of doing it in one?

svitlana1ana · Aug 26 '24
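The two-step workflow being asked about — collect calibration statistics first, then fine-tune with fake quantization in the forward pass — can be sketched in plain Python. This is an illustrative sketch of symmetric int8 fake quantization only, not the pytorch_quantization API; the function names, the amax-based scale, and the round/clamp details are assumptions.

```python
def calibrate_amax(samples):
    """Step 1 (calibration): record the max absolute value seen
    over the calibration data; this fixes the quantization scale."""
    return max(abs(x) for x in samples)

def fake_quantize(x, amax, num_bits=8):
    """Step 2 (used inside QAT forward passes): quantize-dequantize x
    with a symmetric scale derived from the calibrated amax, so the
    network trains against the rounding error it will see at int8."""
    qmax = 2 ** (num_bits - 1) - 1        # 127 for int8
    scale = amax / qmax
    q = round(x / scale)                  # quantize to an integer level
    q = max(-qmax - 1, min(qmax, q))      # clamp to the int8 range
    return q * scale                      # dequantize back to float

# Usage: calibrate once, then reuse the fixed amax during fine-tuning.
amax = calibrate_amax([0.5, -2.0, 1.0])   # -> 2.0
y = fake_quantize(1.0, amax)              # 1.0 snapped to the nearest int8 level
```

The point of splitting the steps is that calibration pins the scales to sensible starting values before training begins; QAT then fine-tunes the weights (and, in pytorch_quantization, optionally the amax values themselves) against the quantization noise.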

Is this a bug in TensorRT, or is the accuracy also poor after QAT when running inference in plain PyTorch? You can try using the ModelOpt package for calibration if it fits your use case.

akhilg-nv · Aug 30 '24

@steven-spec as per our policy, I am going to close this issue as it's older than 21 days. If you'd like to follow up, please open another issue, thank you.

moraxu · Sep 21 '24