                        Should I use pytorch-quantization or not?
Description
I have trained a model in PyTorch and exported it to ONNX. Now I want to run it on TensorRT with FP16. Should I use pytorch-quantization before using TensorRT, or will TensorRT automatically quantize the model when I enable FP16? If TensorRT quantizes automatically, what is the pytorch-quantization tool for?
Environment
TensorRT Version:
NVIDIA GPU:
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Steps To Reproduce
The pytorch-quantization tool is used for INT8 QAT. You don't need it if you just want to use FP16; TRT handles the FP32→FP16 conversion automatically.
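To illustrate what that automatic FP32→FP16 conversion means for stored values, here is a standalone numpy sketch (not TensorRT code; the weight values are made up for illustration):

```python
import numpy as np

# FP32 weights as they might come out of training (hypothetical values)
weights_fp32 = np.array([1.0e-4, 0.5, 3.14159265, 7.0e4], dtype=np.float32)

# Conceptually, FP16 mode rounds each stored value to the nearest
# representable float16
weights_fp16 = weights_fp32.astype(np.float16)

# Values within float16's range (~6.1e-5 .. 65504) survive with reduced
# precision; values beyond the max (here 7.0e4 > 65504) overflow to inf,
# which is why FP16 occasionally needs overflow-sensitive layers kept in FP32
print(weights_fp16)
```

This is also why FP16 needs no calibration data: it is a pure numeric-format change, unlike INT8, which needs per-tensor scales.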
If I use INT8 with calibration in TensorRT, do I still need this tool?
Please refer to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#working-with-int8
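For intuition, the core of INT8 PTQ calibration can be sketched in numpy: run representative inputs through the network, record each tensor's dynamic range, and derive a scale mapping FP32 to INT8. This is a simplified symmetric max-calibration sketch; TensorRT's actual calibrators (e.g. entropy calibration) are more sophisticated:

```python
import numpy as np

def calibrate_scale(activations: np.ndarray) -> float:
    """Symmetric max calibration: map the observed range onto int8."""
    amax = float(np.abs(activations).max())
    return amax / 127.0  # 127 = largest positive int8 value

def quantize(x: np.ndarray, scale: float) -> np.ndarray:
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Stand-in for activations collected from a calibration batch
acts = np.random.default_rng(0).normal(0.0, 1.0, 1024).astype(np.float32)
scale = calibrate_scale(acts)

# Round-trip error is bounded by half a quantization step
err = np.abs(dequantize(quantize(acts, scale), scale) - acts).max()
print(f"scale={scale:.4f}, max round-trip error={err:.5f}")
```

With PTQ, TensorRT does this range collection for you at build time via a calibrator fed with representative data, so pytorch-quantization is not required for that workflow.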
Thanks a lot. I'm using PTQ now. PTQ and QAT are similar, right? 🤔 Do you have any statistics on which one is better (e.g., which gives higher mAP)? If you don't, that's all right; I'll try both of them in the future.
Try PTQ first; if PTQ doesn't satisfy the accuracy requirement, then you can try QAT.
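The key difference: PTQ picks scales after training, while QAT inserts "fake quantization" (quantize-then-dequantize) into the forward pass during training, so the network learns to compensate for rounding error; that is why QAT usually recovers accuracy when PTQ falls short. A minimal numpy sketch of the fake-quant operation (conceptually what pytorch-quantization inserts; in a real framework gradients flow through it via the straight-through estimator):

```python
import numpy as np

def fake_quant(x: np.ndarray, scale: float) -> np.ndarray:
    """Quantize to int8 and immediately dequantize back to float.

    The forward pass 'sees' quantization error, so the loss can
    drive the weights toward values that quantize well.
    """
    q = np.clip(np.round(x / scale), -128, 127)
    return (q * scale).astype(np.float32)

# Example: activations in [-1, 1] with a hypothetical scale of 1/127
x = np.linspace(-1.0, 1.0, 9, dtype=np.float32)
scale = 1.0 / 127.0
out = fake_quant(x, scale)
print(out)
```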
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!