GPTQ-triton

question about the quantization formula

Open irasin opened this issue 1 year ago • 3 comments

The weights are decoded using the formula w = (w - z - 1) * s.

Why is the extra - 1 needed here, given that the usual dequantization formula is w = (w - z) * s?
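To make the comparison concrete, here is a minimal sketch of the two formulas side by side. It shows one possible reading, which nothing in this issue confirms: if the packed zero point were stored as z - 1, then subtracting an extra 1 at decode time would give exactly the usual result. The function names, the 4-bit code range, and the `stored_zero = zero - 1` convention are all assumptions made for illustration, not taken from the repository.

```python
def dequant_standard(q, z, s):
    """Usual affine dequantization: w = (q - z) * s."""
    return (q - z) * s


def dequant_offset(q, z_stored, s):
    """Formula quoted in this issue: w = (q - z_stored - 1) * s."""
    return (q - z_stored - 1) * s


scale = 0.5             # illustrative scale
zero = 8                # illustrative "true" zero point
codes = list(range(16)) # all 4-bit integer codes (assumed bit width)

# Assumption: the zero point is stored with an offset of one.
stored_zero = zero - 1

standard = [dequant_standard(q, zero, scale) for q in codes]
offset = [dequant_offset(q, stored_zero, scale) for q in codes]

# Under that assumption the two formulas agree for every code.
assert standard == offset
```

If the stored zero point is not offset this way, the two formulas differ by exactly one scale step (s) for every weight.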

irasin, May 11 '23 09:05