CalibTIP
Quantized model accuracy
Good afternoon,
I found your paper very interesting and wanted to try out your code. I have several questions that I would be grateful if you could answer:
- In your paper you report a 71.97% baseline for the FP32 ResNet18 model. Is this your own trained checkpoint? The torchvision model zoo reports 69.76% top-1 accuracy.
- I am trying to measure the accuracy of a quantized model (both weights and activations quantized). The model saved after, for example, bn_tuning contains layers with the attributes: quantize_weight.running_range, quantize_weight.running_zero_point, quantize_input.running_range, quantize_input.running_zero_point, beta, gamma, equ_scale, and num_measured.
- For activations, naturally, I placed forward hooks that quantize the inputs to layers using quantize_input.running_zero_point and quantize_input.running_range. However, there are no range and zero-point values for quantizing the inputs to the residual connections (where residuals are summed with the outputs of the previous layer). Did you not quantize those inputs when measuring accuracy?
- Do I understand correctly that the saved model already contains quantized weights, or do I have to quantize them myself using the quantize_weight.running_range, quantize_weight.running_zero_point, gamma, and beta attributes?
- Maybe you already have a script for measuring the accuracy of a quantized model and I just missed it?
- What is the purpose of the equ_scale attribute? In all of my tests on the ResNet18 model it only contained ones (a tensor of 1s).
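For concreteness, the running_range / running_zero_point pair suggests an asymmetric uniform fake-quantizer along these lines (a sketch under that assumption, not the repo's actual code; the function name and num_bits default are mine):

```python
import torch

def fake_quantize(x, running_range, running_zero_point, num_bits=8):
    # Hypothetical uniform quantizer: map x onto 2**num_bits integer
    # levels using a scale derived from the measured range, then
    # dequantize back to float (so the error matches integer inference).
    qmax = 2 ** num_bits - 1
    scale = running_range / qmax
    q = torch.round(x / scale + running_zero_point)
    q = torch.clamp(q, 0, qmax)
    return (q - running_zero_point) * scale
```

With a range of 1.0 and zero point 0, values in [0, 1] round-trip with at most half a quantization step of error.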
- I used pytorchcv models.
- I quantize both the weights and the activations before running the nn.Conv2d forward; see the Qconv_2d implementation.
- As part of the main script, we run evaluation after AdaQuant quantization.
- I tried to simply learn the equalization scaling via the equ_scale variable, but it did not work that well (it stayed stuck at 1); debugging it might improve results.
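The "quantize weights and activations before the conv" scheme described above can be sketched as a fake-quantized conv layer. This is a minimal illustration, not the repo's Qconv_2d: the flat buffer names and the uniform quantizer are simplified stand-ins for the checkpoint's quantize_weight.* / quantize_input.* submodules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QConv2dSketch(nn.Conv2d):
    """Fake-quantize weights and inputs with measured ranges, then run
    the ordinary float convolution. Buffer names are simplified stand-ins
    for the checkpoint's quantize_weight.* / quantize_input.* fields."""

    def __init__(self, *args, num_bits=8, **kwargs):
        super().__init__(*args, **kwargs)
        self.num_bits = num_bits
        # Placeholder calibration values; in practice these would be
        # loaded from the saved checkpoint, not left at their defaults.
        self.register_buffer('weight_range', torch.ones(1))
        self.register_buffer('weight_zero_point', torch.zeros(1))
        self.register_buffer('input_range', torch.ones(1))
        self.register_buffer('input_zero_point', torch.zeros(1))

    def _fake_quant(self, x, rng, zp):
        # Uniform asymmetric quantize/dequantize round trip.
        qmax = 2 ** self.num_bits - 1
        scale = rng / qmax
        q = torch.clamp(torch.round(x / scale + zp), 0, qmax)
        return (q - zp) * scale

    def forward(self, x):
        w_q = self._fake_quant(self.weight, self.weight_range,
                               self.weight_zero_point)
        x_q = self._fake_quant(x, self.input_range, self.input_zero_point)
        return F.conv2d(x_q, w_q, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

Evaluating accuracy then only requires swapping such layers into the model and running the usual validation loop; no separate dequantization pass is needed, since the quantization error is already baked into the float tensors.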