nncf Symmetric Quantization formula

Open linhdb-2149 opened this issue 2 years ago • 2 comments

Are the quantization formulas (especially the symmetric quantization formula) the same as those reported in the paper? https://arxiv.org/pdf/2002.08679.pdf

Paper:

Docs: https://github.com/openvinotoolkit/nncf/blob/4c7a0045abd557af13eb9b6386f88f5d5d30a2ff/docs/compression_algorithms/Quantization.md

According to the paper, there is no rescaling ($*\frac{scale}{levelhigh}$) after rounding. Furthermore, the multiplication factor applied to the input should be reversed, since $s=\frac{q_{max} }{scale}$.

Please have a look. Thank you.

Jul 20 '22 04:07 linhdb-2149

I also found that the NNCF formula is not consistent with the one in the Intel Labs Distiller documentation: https://intellabs.github.io/distiller/algo_quantization.html

Jul 20 '22 04:07 linhdb-2149

Hi @linhdb-2149,

This is our quantization forward function for CPU: https://github.com/openvinotoolkit/nncf/blob/13cc15e6560985606abfd55086242d9f7687d6dd/nncf/torch/extensions/src/quantization/cpu/functions_cpu.cpp#L12-L19. You can see that our implementation multiplies the real-valued input tensor `input` by $s = \frac{q_{max}}{scale}$ (where $q_{max}$ = levels - 1 and $scale$ = input_range). So our actual implementation computes $s * r$ to quantize the input, following the notation above.

Thank you

Jul 20 '22 07:07 vinnamkim
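
For readers comparing the two formulas, here is a minimal sketch of how multiplying by $s = \frac{q_{max}}{scale}$ and a later rescale by $\frac{scale}{q_{max}}$ can fit together. It is not NNCF's code: the function names, the symmetric clamp range $[-q_{max}, q_{max}]$, and the use of Python's built-in `round` (which rounds ties to even) are assumptions made for illustration only.

```python
def quantize(r, scale, q_max):
    """Map a real value r to an integer level: clamp(round(s * r)).

    s = q_max / scale is the multiplier discussed in the thread.
    The clamp range [-q_max, q_max] is an assumption of this sketch.
    """
    s = q_max / scale
    q = round(s * r)  # Python's round() rounds ties to the nearest even integer
    return max(-q_max, min(q_max, q))


def dequantize(q, scale, q_max):
    """Map an integer level back to the real domain: q * scale / q_max."""
    return q * scale / q_max


def fake_quantize(r, scale, q_max):
    """Quantize then immediately dequantize.

    The output stays in the real domain but only takes values on the
    quantization grid, which is why a rescale appears after the rounding.
    """
    return dequantize(quantize(r, scale, q_max), scale, q_max)


if __name__ == "__main__":
    scale, q_max = 1.0, 127  # hypothetical 8-bit symmetric setup
    for r in (0.3, -0.3, 2.0):
        print(r, quantize(r, scale, q_max), fake_quantize(r, scale, q_max))
```

Under these assumptions, `quantize` alone corresponds to a formula with no rescale after rounding, while `fake_quantize` adds the multiplication by $\frac{scale}{q_{max}}$; both use the same multiplier $s$ on the input.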
