AutoAWQ
question about s^-1 * x in code 313
In this code, the author's comment says that `s^-1 * x` is fused here, in line with the paper, but the x used below is simply the input x. Could you explain where x is actually multiplied by s^-1? After all, equation (3) in the paper requires x to be multiplied by s^-1.
Thanks for your reply!
`s^-1 * x` is not explicitly computed; according to the authors, it is fused. I added the comment in the quantization code because I understand it as being fused during the creation of the scales via `x_max.pow(ratio)`. See references 44, 46 in the paper.
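To make the `x_max.pow(ratio)` step concrete, here is a minimal NumPy sketch of how per-channel scales could be derived from activation statistics. It is my reading of the idea, not the quantizer's actual code: the helper name `awq_scales`, the mean-absolute-value statistic, the clamp value and the normalization by `sqrt(max * min)` are all assumptions for illustration.

```python
import numpy as np

def awq_scales(x_max, ratio):
    """Hypothetical helper: per-input-channel scales from an activation statistic."""
    # Larger activations get larger scales; ratio controls how strongly.
    scales = np.clip(x_max ** ratio, 1e-4, None)
    # Assumed normalization: keeps the scales balanced so that max * min == 1.
    return scales / np.sqrt(scales.max() * scales.min())

rng = np.random.default_rng(0)
acts = rng.normal(size=(32, 16))     # made-up calibration activations [tokens, in_features]
x_max = np.abs(acts).mean(axis=0)    # one magnitude statistic per input channel

s_zero = awq_scales(x_max, 0.0)      # ratio = 0: no scaling at all
s_half = awq_scales(x_max, 0.5)      # ratio = 0.5: scales follow sqrt of the statistic
```

With `ratio = 0` every scale is 1, so nothing changes; as `ratio` grows, channels with large activations get larger scales and their weights are protected more during quantization. The activations only enter through `s`, which is why no separate `s^-1 * x` step appears at this point.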
Thank you for your patient reply. I understand `s^-1 * x` now. Besides, at code line 325, `fc.weight.data)[0] / scales_view`, it seems that this operation does not work.
I came here with the same question and finally got it.
The reason `s^-1 * x` is never computed explicitly can be seen at lines 334-336:
https://github.com/casper-hansen/AutoAWQ/blob/c53cc7e8cf65747bab526a3c9e9ee37e580b8c39/awq/quantize/quantizer.py#L331-L336
wo-quant: `Q(W) @ X`
wo-quant with awq: instead of `Q(W*s) @ (s^-1 * X)`, the code fuses s^-1 into the weight: `(Q(W*s) * s^-1) @ X`
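A quick numerical check of this equivalence, as a NumPy sketch rather than the library's code. The `pseudo_quantize` function below is a simplified stand-in for the quantizer's fake-quantization step (per-row, asymmetric, 4-bit), and the shapes and random values are made up for illustration.

```python
import numpy as np

def pseudo_quantize(w, n_bits=4):
    """Simplified fake quantization: snap each row to an integer grid, then dequantize."""
    max_int = 2 ** n_bits - 1
    w_max = w.max(axis=1, keepdims=True)
    w_min = w.min(axis=1, keepdims=True)
    step = np.maximum(w_max - w_min, 1e-5) / max_int
    zero = np.round(-w_min / step)
    q = np.clip(np.round(w / step) + zero, 0, max_int)
    return (q - zero) * step

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))             # [out_features, in_features]
x = rng.normal(size=(16,))               # one input activation vector
s = rng.uniform(0.5, 2.0, size=(16,))    # one scale per input channel

qw = pseudo_quantize(W * s)              # Q(W * s), s broadcast across rows

y_explicit = qw @ (x / s)                # Q(W*s) @ (s^-1 * x): scale the input
y_fused = (qw / s) @ x                   # (Q(W*s) * s^-1) @ x: scale the weight instead

assert np.allclose(y_explicit, y_fused)
```

Both forms compute the same sum over input channels, `qw[i, j] * x[j] / s[j]`, so dividing the quantized weight by `s` once replaces dividing every incoming `x` by `s` at inference time.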
Hope that this could save some time for the next guy coming here. :)