nncf
Symmetric Quantization formula
Are the quantization formulas (especially the symmetric quantization formula) the same as those reported in the paper? https://arxiv.org/pdf/2002.08679.pdf
Paper:
Docs:
https://github.com/openvinotoolkit/nncf/blob/4c7a0045abd557af13eb9b6386f88f5d5d30a2ff/docs/compression_algorithms/Quantization.md
According to the paper, there is no rescaling ($\times \frac{scale}{level\_high}$) after rounding. Furthermore, the factor that multiplies the input should be inverted, since $s=\frac{q_{max}}{scale}$.
Please have a look. Thank you.
I also found that the NNCF formula is not consistent with the one in the Intel Labs Distiller documentation: https://intellabs.github.io/distiller/algo_quantization.html
Hi @linhdb-2149,
This is our quantization forward function for CPU.
https://github.com/openvinotoolkit/nncf/blob/13cc15e6560985606abfd55086242d9f7687d6dd/nncf/torch/extensions/src/quantization/cpu/functions_cpu.cpp#L12-L19
You can see that our implementation multiplies the real-valued input tensor `input` by `s` = $\frac{q_{max}}{scale}$ (where $q_{max}$ = `levels - 1` and $scale$ = `input_range`). So our actual implementation computes $s * r$ to quantize the input, following the notation above.
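To make the steps concrete, here is a minimal pure-Python sketch of a forward pass of this kind. It is an illustration, not the NNCF source: the function name `fake_quantize` is made up for this example, the parameter names are assumed to mirror `input_low`, `input_range` and `levels` from the linked C++ function, and the clamp and de-quantize steps are assumptions about a typical fake-quantization flow.

```python
def fake_quantize(values, input_low, input_range, levels):
    """Quantize then de-quantize floats onto `levels` evenly spaced points
    in [input_low, input_low + input_range].

    Hypothetical helper for illustration; not part of the NNCF API.
    """
    s = (levels - 1) / input_range  # s = q_max / scale
    out = []
    for r in values:
        # Assumed step: clamp the real value into the quantization range.
        clamped = min(max(r, input_low), input_low + input_range)
        # Quantize: multiply by s and round (Python rounds halves to even).
        q = round((clamped - input_low) * s)
        # De-quantize: dividing by s equals multiplying by scale / q_max.
        out.append(q / s + input_low)
    return out


# 5 levels on [0, 1] give the grid 0, 0.25, 0.5, 0.75, 1.
print(fake_quantize([0.3, 0.6, 1.7, -0.2], input_low=0.0, input_range=1.0, levels=5))
```

In this sketch the input is multiplied by $s$ before rounding, and the result is divided by $s$ afterwards so the output stays in the real-valued range. That final division appears to correspond to the $\frac{scale}{level\_high}$ rescale mentioned in the question, while a formula that stops at the rounded integer omits it.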
Thank you