GPTQ-triton
Question about the quantization formula
The weights are decoded using the formula w = (w - z - 1) * s.
I wonder why the extra - 1 is needed here, since the usual dequantization is w = (w - z) * s.
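
To make the question concrete, here is a small sketch comparing the two decode formulas. It only shows the arithmetic: the two agree exactly when the stored zero point is one less than the true one. Whether the packing code actually stores z - 1 is my assumption, not something confirmed here, and the function names and sample values are made up for illustration.

```python
# Hypothetical helpers, not taken from the GPTQ-triton code base.

def decode_with_offset(q, z_stored, s):
    """Decode with the formula quoted above: (q - z_stored - 1) * s."""
    return [(v - z_stored - 1) * s for v in q]


def decode_standard(q, z, s):
    """Decode with the usual affine formula: (q - z) * s."""
    return [(v - z) * s for v in q]


q = [0, 3, 8, 15]       # example 4-bit quantized values
s = 0.25                # example scale
z_true = 8              # zero point in the usual convention
z_stored = z_true - 1   # ASSUMPTION: zero point saved with a -1 offset

offset_result = decode_with_offset(q, z_stored, s)
standard_result = decode_standard(q, z_true, s)

# Both lists are identical under the assumption above.
print(offset_result == standard_result)
```

If that assumption holds, the - 1 in the decode step just undoes an offset applied when the zero points were saved, and the result is the same as the usual formula.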