ZeroQ
Why do the weights need to be dequantized right after being quantized?
# map x onto the integer grid
new_quant_x = linear_quantize(x, scale, zero_point, inplace=False)
# clamp to the signed k-bit range [-2^(k-1), 2^(k-1) - 1]
n = 2**(k - 1)
new_quant_x = torch.clamp(new_quant_x, -n, n - 1)
# map the clamped integers back to floating point
quant_x = linear_dequantize(new_quant_x, scale, zero_point, inplace=False)
Doesn't this just give back floating-point weights?
From my point of view, the code released with most quantization papers uses a fake-quantization (quantize-dequantize) operation to simulate quantization. So the weights are still floating-point numbers; they are just restricted to the values that the k-bit integers can represent.
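In code, that quantize-clamp-dequantize round trip looks roughly like the sketch below. This is only an illustration of the fake-quantization idea under a simple linear scheme; fake_quantize and the example tensors are made-up names and values, not ZeroQ's actual linear_quantize/linear_dequantize helpers.

import torch

def fake_quantize(x, scale, zero_point, k):
    # Quantize: map float values onto the integer grid
    q = torch.round(x / scale + zero_point)
    # Clamp to the signed k-bit range [-2^(k-1), 2^(k-1) - 1]
    n = 2 ** (k - 1)
    q = torch.clamp(q, -n, n - 1)
    # Dequantize: map the clamped integers back to floating point
    return (q - zero_point) * scale

x = torch.tensor([0.07, -0.31, 0.52])
x_hat = fake_quantize(x, scale=torch.tensor(0.1), zero_point=torch.tensor(0.0), k=4)
print(x_hat.dtype)  # torch.float32 -- still floats, just snapped to multiples of the scale

The output is still a float tensor; only the set of values it can take is restricted to what k-bit integers could represent. That is what lets frameworks simulate quantization during training and evaluation (usually passing gradients through with a straight-through estimator) without needing integer kernels.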