OQA
question about supernet's quantization function and bit inheritance
Thanks for sharing the awesome work! I have one question about the quantization function and bit inheritance. In the paper, you choose LSQ as the quantization function, which has a learnable scale parameter, and you double that parameter when the bit-width goes down (bit inheritance). My question is: if we use Google's TFLite int16/int8 quantization function, which has a scale and a zero point, does your OQA algorithm still work? If it works, how should I adjust the bit inheritance algorithm?
Thanks!
Thanks for your interest in our work.
We only implement LSQ in this framework, as LSQ is the state-of-the-art quantization method, and we chose it because the quantization scales need to be trained during the supernet training process.
The whole framework is orthogonal to the choice of quantization method, and bit inheritance is also helpful when training a single quantized network.
As for the details of adjusting bit inheritance for the TFLite int16/int8 quantization function, that remains unexplored. If you have advice or practical experience, you are most welcome to contribute to our repo~
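To illustrate the idea, here is a minimal sketch of the bit-inheritance rule with an LSQ-style quantizer, plus a possible (untested) analogue for a scale + zero-point quantizer like TFLite's. The function names and the affine rule are assumptions for discussion, not the repo's actual code, and the LSQ sketch omits the straight-through estimator used for gradients.

```python
# Hypothetical sketch of bit inheritance; not the repo's actual code.
import torch


def lsq_quantize(x, scale, n_bits):
    """Symmetric LSQ-style quantization with a learnable scale.
    (Gradient handling via the straight-through estimator is omitted.)"""
    qmax = 2 ** (n_bits - 1) - 1
    qmin = -(2 ** (n_bits - 1))
    q = torch.clamp(torch.round(x / scale), qmin, qmax)
    return q * scale


def inherit_lsq_scale(scale, old_bits, new_bits):
    """Going from old_bits to new_bits (< old_bits): doubling the scale per
    removed bit keeps the representable range roughly unchanged, since the
    number of quantization levels is halved each time."""
    return scale * 2.0 ** (old_bits - new_bits)


def inherit_affine_params(scale, zero_point, old_bits, new_bits):
    """A guess at the analogous rule for an asymmetric (scale, zero_point)
    quantizer: enlarge the step size the same way, and shrink the integer
    zero point by the same factor so the real value it represents stays
    roughly fixed."""
    factor = 2.0 ** (old_bits - new_bits)
    return scale * factor, int(round(zero_point / factor))
```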
Thanks for your quick reply! Can OQA be adjusted to support mixed-precision (e.g. int8+int16) quantization in the supernet training process? As far as I know, the current OQA only supports a fixed-precision supernet in which all layers have the same bit-width.
Thanks again!
Mixed-precision is not hardware-friendly to deploy, so we didn't focus on this part.
We actually already support a mixed-precision (2, 3, and 4 bits) supernet for heavy networks like ResNet-34, and it can surpass the accuracy of training from scratch. For compact networks there are still some problems; we will fix them as soon as possible and release the training code when it's ready~
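In case it helps to picture it, below is a rough, hypothetical sketch of how per-layer bit-widths could be sampled during a mixed-precision supernet training step. The layer interface (a settable `n_bits` attribute) and the training loop are assumptions for illustration only, not the code we will release.

```python
# Hypothetical sketch of per-layer bit-width sampling for a
# mixed-precision supernet training step; not the actual OQA code.
import random

BIT_CHOICES = (2, 3, 4)


def sample_bit_config(num_layers):
    """Pick an independent bit-width for every quantized layer."""
    return [random.choice(BIT_CHOICES) for _ in range(num_layers)]


def train_step(model, batch, optimizer, criterion):
    # Assumes quantized layers expose a settable `n_bits` attribute;
    # this interface is hypothetical.
    layers = [m for m in model.modules() if hasattr(m, "n_bits")]
    for layer, bits in zip(layers, sample_bit_config(len(layers))):
        layer.n_bits = bits
    optimizer.zero_grad()
    inputs, targets = batch
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```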