AutoAWQ

Question about s^-1 * x at line 313 of the code

Open xiaoxiangshusheng opened this issue 11 months ago • 3 comments

image

In this code, the author's comment says that s^-1 * x is fused here, as in the paper, but below that x is simply the input x. Could you explain where x gets multiplied by s^-1? After all, equation (3) in the paper treats multiplying x by s^-1 as essential. Thanks for your reply!

xiaoxiangshusheng, Feb 29 '24 09:02

s^-1 * x is not explicitly computed; according to the authors, it is fused. I added the comment in the quantization code because I understand it as being fused during the creation of the scales, due to x_max.pow(ratio). See references 44 and 46 in the paper.

image

casper-hansen, Feb 29 '24 09:02

Thank you for taking the time to reply. I understand s^-1 * x now. Also, at line 325 of the code, "fc.weight.data)[0] / scales_view", it seems that this operation does not work.

xiaoxiangshusheng, Feb 29 '24 09:02

I came here with the same question and finally got it.

The reason s^-1 * x is never computed explicitly is at lines 334-336:

https://github.com/casper-hansen/AutoAWQ/blob/c53cc7e8cf65747bab526a3c9e9ee37e580b8c39/awq/quantize/quantizer.py#L331-L336

Weight-only quantization (wo-quant): Q(W) @ X

wo-quant with AWQ: instead of computing Q(W*s) @ (s^-1 * X), the code fuses s^-1 into the weight: (Q(W*s) * s^-1) @ X
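
A minimal sketch of why the two forms give the same output, written in plain Python rather than PyTorch so it runs without dependencies. `pseudo_quantize_row` is a hypothetical stand-in for the library's quantizer (a simple per-row 4-bit round-to-nearest), and the shapes, seed, and scale range are arbitrary assumptions. Only the algebra is the point:

```python
import random


def pseudo_quantize_row(row, n_bits=4):
    """Hypothetical quantize-then-dequantize of one weight row (asymmetric, round-to-nearest)."""
    q_max = 2**n_bits - 1
    lo, hi = min(row), max(row)
    step = max(hi - lo, 1e-8) / q_max
    zero = round(-lo / step)
    return [(min(max(round(v / step) + zero, 0), q_max) - zero) * step for v in row]


def matvec(W, x):
    """Compute W @ x for a list-of-rows matrix and a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


random.seed(0)
out_features, in_features = 4, 8
W = [[random.gauss(0, 1) for _ in range(in_features)] for _ in range(out_features)]
x = [random.gauss(0, 1) for _ in range(in_features)]
s = [random.uniform(0.5, 2.0) for _ in range(in_features)]  # one scale per input channel

# Q(W * s): scale each input channel (column), then quantize each output row.
Wq_scaled = [pseudo_quantize_row([w * sc for w, sc in zip(row, s)]) for row in W]

# Explicit form: Q(W*s) @ (s^-1 * x), the activation is rescaled at run time.
explicit = matvec(Wq_scaled, [v / sc for v, sc in zip(x, s)])

# Fused form: (Q(W*s) * s^-1) @ x, the inverse scale is folded into the weight once.
W_fused = [[w / sc for w, sc in zip(row, s)] for row in Wq_scaled]
fused = matvec(W_fused, x)

max_diff = max(abs(a - b) for a, b in zip(explicit, fused))
print(max_diff)  # differs only by floating-point rounding
```

Because s acts per input channel, dividing the columns of Q(W*s) by s gives the same product as dividing the entries of x by s, so the inverse scale can live in the stored weight and x passes through unchanged.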

Hope this saves some time for the next person who comes here. :)

siahuat0727, Jun 26 '24 08:06