
Improvement of equalization method.

ZhangZhiPku opened this issue 2 years ago

Hi, developers of Brevitas. I worked with the Vitis AI team at Xilinx for several months (2020~2021, internship), and I now work on building better network quantization tools at SenseTime.

In 2020, we found some equalization tricks that can greatly improve the accuracy of quantized networks. As we all know, Xilinx FPGAs require the bias to be quantized to 8 bits, which is a very distinctive and restrictive policy in network quantization. To meet this requirement, we have to control and limit the range of the bias, especially for algorithms like equalization, since they apply a scale factor to both weight and bias. Some scale factors reduce both the weight and bias ranges, but if we don't take the bias into consideration, the scale factor solved by torch.sqrt(srcs_range / sinks_range) might lead to an extremely large bias.
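For illustration, here is a minimal PyTorch sketch of the weight-only cross-layer equalization update being discussed. The function name `equalize_pair`, the tensor layouts (`[out_ch, in_ch, ...]` weights), and the per-channel absolute max as the range statistic are assumptions for the sake of the example, not Brevitas or Vitis AI code:

```python
import torch

def equalize_pair(src_weight, src_bias, sink_weight):
    # Hypothetical per-channel cross-layer equalization sketch.
    n_ch = src_weight.shape[0]

    # Per-output-channel range of the source weights.
    srcs_range = src_weight.abs().reshape(n_ch, -1).max(dim=1).values
    # Per-input-channel range of the sink weights.
    sinks_range = sink_weight.abs().transpose(0, 1).reshape(n_ch, -1).max(dim=1).values

    # Weight-only scale factor quoted in the issue text.
    scale = torch.sqrt(srcs_range / sinks_range)

    # The source weight and bias are divided by the scale and the sink weight
    # is multiplied by it; when sinks_range >> srcs_range the scale is tiny
    # and the division inflates the source bias.
    src_shape = (-1,) + (1,) * (src_weight.dim() - 1)
    sink_shape = (1, -1) + (1,) * (sink_weight.dim() - 2)
    return (src_weight / scale.reshape(src_shape),
            src_bias / scale,
            sink_weight * scale.reshape(sink_shape))
```

For a pair of consecutive layers this would be called as something like `equalize_pair(conv1.weight, conv1.bias, conv2.weight)`, assuming the channel dimensions line up.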

So here is our solution: when calculating the scale factor of a layer, we first concatenate the weight and bias into a single tensor, then use that tensor to compute the scale factor instead. A scale factor generated this way is affected by the value of the bias and balances weight and bias together. In our experiments, this little trick serves as a crucial improvement in Vitis AI; would you like to introduce this feature into Brevitas in the future? A sketch of the idea is shown below.
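Here is a hedged sketch of the concatenation trick described above, under the same assumptions as the previous snippet (hypothetical names and layouts, per-channel absolute max as the range statistic; this is not the Vitis AI or PPQ implementation):

```python
import torch

def bias_aware_scale(src_weight, src_bias, sink_weight):
    # The bias is concatenated to the flattened per-channel weights before
    # the range is taken, so a channel with a large bias gets a larger
    # srcs_range and therefore a larger scale, which keeps the rescaled
    # bias bounded.
    n_ch = src_weight.shape[0]
    src_cat = torch.cat(
        [src_weight.reshape(n_ch, -1), src_bias.reshape(n_ch, 1)], dim=1)
    srcs_range = src_cat.abs().max(dim=1).values

    # Sink side is unchanged: per-input-channel range of the consumer weights.
    sinks_range = sink_weight.abs().transpose(0, 1).reshape(n_ch, -1).max(dim=1).values

    return torch.sqrt(srcs_range / sinks_range)
```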

Refer to: https://github.com/Xilinx/Vitis-AI/blob/master/tools/Vitis-AI-Quantizer/vai_q_pytorch/nndct_shared/optimization/commander.py

Refer to: https://github.com/openppl-public/ppq/blob/master/ppq/quantization/algorithm/equalization.py

ZhangZhiPku avatar May 18 '22 17:05 ZhangZhiPku

Hi @ZhangZhiPku,

Thanks for the suggestion. I experimented with the strategy you are describing some time ago, but at the time I didn't notice much of a difference on the models I was working with, probably because they used higher-precision integer biases (16 or 32 bits). I should probably revisit it and incorporate it.

volcacius avatar Jun 10 '22 00:06 volcacius

#534 implemented this method, closing this issue.

Thanks for your contribution!

Giuseppe5 avatar Mar 03 '23 11:03 Giuseppe5