TensorRT
Performance comparison of the resnet18 model using PTQ INT8 quantization (FP16 vs. INT8)
Description
After using onnx-tensorrt to run PTQ INT8 quantization on the resnet18 model, I found that INT8 performance was the same as FP16 (batch size = 64). Is this expected? Have you compared the performance of INT8-quantized resnet18 against FP16, and do you have any data? Can you help confirm whether equal FP16 and INT8 performance is reasonable here?
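For reference, here is a minimal sketch of one way to time an FP16 build against an INT8 build with the TensorRT Python API. The file name `resnet18.onnx`, the TensorRT 7/8-era builder calls, and the bare-bones buffer handling are assumptions, not the exact setup from my run; a real PTQ build also needs an `IInt8Calibrator` attached where indicated.

```python
import time

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, int8=False):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)      # FP16 baseline
    if int8:
        config.set_flag(trt.BuilderFlag.INT8)  # allow INT8 kernels
        # config.int8_calibrator = MyCalibrator(...)  # your PTQ calibrator
    return builder.build_engine(network, config)

def benchmark(engine, iters=100):
    context = engine.create_execution_context()
    # Allocate a device buffer for every binding (inputs and outputs).
    bindings = []
    for i in range(engine.num_bindings):
        size = trt.volume(engine.get_binding_shape(i)) * np.float32().itemsize
        bindings.append(int(cuda.mem_alloc(size)))
    # Warm up, then time synchronous executions.
    for _ in range(10):
        context.execute_v2(bindings)
    start = time.perf_counter()
    for _ in range(iters):
        context.execute_v2(bindings)
    return (time.perf_counter() - start) / iters * 1e3  # ms per batch

fp16_engine = build_engine("resnet18.onnx", int8=False)
int8_engine = build_engine("resnet18.onnx", int8=True)
print("FP16: %.2f ms/batch" % benchmark(fp16_engine))
print("INT8: %.2f ms/batch" % benchmark(int8_engine))
```

In my case the two printed latencies came out essentially equal, which is what prompted this question.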