TensorRT
[BUG] ResNet50 model has wrong precision after quantization with TensorRT INT8 PTQ
Description
When I quantize the resnet50 from the timm model library with INT8 PTQ, accuracy degrades noticeably. In the same environment, the accuracy of the torchvision resnet50 before and after quantization is essentially unchanged.
- First, what method does TensorRT use to quantize the model, and what is the implementation principle?
- I compared the model structures of timm and torchvision and found no difference, though the bias tensors of some BN layers do differ. I don't think that should cause the difference in quantization accuracy, should it?
ref: https://github.com/rwightman/pytorch-image-models/issues/1412
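Regarding the first question above: broadly, TensorRT's INT8 PTQ runs calibration data through the network, collects activation histograms, and picks a per-tensor scale (by default via entropy/KL calibration) so that float values map to int8 as `q = clamp(round(x / scale), -128, 127)`. A minimal sketch of the symmetric quantize/dequantize step, using a simple max-based scale for illustration rather than TensorRT's actual entropy calibrator:

```python
import numpy as np

def quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    # Symmetric per-tensor INT8 quantization: q = clamp(round(x / scale)).
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Max calibration: choose the scale so the observed |amax| maps to 127.
# (TensorRT's default entropy calibrator instead searches for a clipping
# threshold that minimizes the KL divergence between the fp32 and int8
# activation distributions.)
acts = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
scale = float(np.abs(acts).max()) / 127.0
roundtrip = dequantize(quantize_int8(acts, scale), scale)
# Round-trip error is bounded by half a quantization step.
assert np.abs(roundtrip - acts).max() <= scale / 2 + 1e-6
```

The calibration dataset only influences the choice of `scale`; a poor choice (or mismatched preprocessing) clips or coarsens the activation range and shows up directly as an accuracy drop.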
Did you do the INT8 calibration with real data? And how many images are you using for calibration?
I used the ImageNet-1k (PyTorch) validation set of 50,000 images; 512 validation images were randomly selected for calibration.
ResNet50 accuracy is covered in our L1 and QA tests, so I would be surprised if it had an accuracy issue. Have you checked the code that does the calibration and made sure the preprocessing of the input images is correct?
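For reference, calibration inputs generally need the exact same preprocessing as evaluation inputs. A minimal sketch of the standard ImageNet normalization (the mean/std constants below are the usual ImageNet values shared by the timm and torchvision resnet50 default configs; resizing and center-cropping to 224x224 are assumed to have happened already):

```python
import numpy as np

# Standard ImageNet normalization constants.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Convert an HWC uint8 image to a normalized CHW float32 array."""
    x = image_hwc_uint8.astype(np.float32) / 255.0   # [0, 255] -> [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD           # per-channel normalize
    return np.transpose(x, (2, 0, 1))                # HWC -> CHW
```

Note that timm stores each model's expected preprocessing in `model.default_cfg` (interpolation, crop percentage, mean/std); if the calibrator's preprocessing differs from what the model was trained with, the calibration scales will be wrong and INT8 accuracy suffers.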
@zerollzeng, the verification script I used for my test is this one: https://github.com/rmccorm4/tensorrt-utils/blob/master/int8/calibration/ImagenetCalibrator.py The only code difference between the two tests is shown below, yet the accuracy gap between the two quantized models is too large.

timm:

```python
net = timm.create_model('resnet50', pretrained=True)
model = torch.jit.script(net).eval().cuda()
```
torchvision:

```python
import torchvision.models as models

mod = models.resnet50(pretrained=True).eval()
mod_jit = torch.jit.script(mod)
model = mod_jit.cuda()
```
@zerollzeng Can you provide contact information, an email address or similar? I'd like to contact you privately.
As for the BN layers: after quantizing with pytorch_quantization and exporting to ONNX, I found that standalone BN layers still exist in the model, but they do not appear in models produced by other ResNet50 quantization repos.
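Leftover BatchNormalization nodes usually mean Conv+BN fusion did not run before export. The standard BN-folding rewrite absorbs the BN parameters into the preceding conv/linear weights, which is why a cleanly fused export has no standalone BN nodes. A sketch of the folding math, verified on a toy linear layer (the per-channel algebra is the same for a convolution):

```python
import numpy as np

def fold_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding layer's weights.

    BN(Wx + b) = gamma * (Wx + b - mean) / sqrt(var + eps) + beta
               = (W * s) x + ((b - mean) * s + beta),  s = gamma / sqrt(var + eps)
    """
    s = gamma / np.sqrt(var + eps)        # per-output-channel scale
    W_folded = W * s[:, None]             # scale each output row of W
    b_folded = (b - mean) * s + beta      # absorb the BN shift into the bias
    return W_folded, b_folded

# Check: BN(Wx + b) == W_folded @ x + b_folded for random parameters.
rng = np.random.default_rng(0)
W, b = rng.standard_normal((4, 3)), rng.standard_normal(4)
gamma, beta = rng.standard_normal(4), rng.standard_normal(4)
mean, var = rng.standard_normal(4), rng.random(4) + 0.1
x = rng.standard_normal(3)

y = W @ x + b
bn_out = gamma * (y - mean) / np.sqrt(var + 1e-5) + beta
Wf, bf = fold_bn(W, b, gamma, beta, mean, var)
assert np.allclose(bn_out, Wf @ x + bf)
```

If the exported ONNX graph still contains BatchNormalization nodes between quantized convolutions, the quantization scales bracket the unfused activations, which can explain an accuracy gap relative to toolchains that fuse first.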
I will close this issue. We have now released https://github.com/NVIDIA/TensorRT-Model-Optimizer for quantization. Thanks all!