llm-awq
Question about Auto Scale Process
- In auto_scale.py, I find that awq no longer considers the weight scale when searching for the best scales. Is this because considering only the activation scale already gives results similar to considering both?
- When searching for the best scales, `(org_out - out).float().pow(2).mean().item()` is used as the metric. But when computing the `out` result after rescaling, shouldn't the input x be divided by the scales, to align with the apply_scale process?
Hi,
- We found that adding the weight scale does not improve performance, so we removed it for simplicity.
- Yes, but we apply the division to the weight instead, for ease of implementation (see here: https://github.com/mit-han-lab/llm-awq/blob/f0b4b68004f76d562658143cddea5aad8c1b8266/awq/quantize/auto_scale.py#L128).
Hope it addresses your questions!
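The equivalence mentioned in the reply above can be checked numerically. The sketch below (my own toy round-to-nearest quantizer standing in for `w_quantize_func`; all numbers are illustrative, not from the repo) shows that dividing the input by the scales and folding the division into the quantized weight give the same output, since the scales act per input channel:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))        # activations: (batch, in_features)
w = rng.standard_normal((5, 4))        # weights: (out_features, in_features)
s = np.array([2.0, 0.5, 1.0, 4.0])     # per-input-channel scales

def quantize(w, bits=4):
    # toy symmetric round-to-nearest quantizer, one step size per output row
    delta = np.abs(w).max(axis=1, keepdims=True) / (2 ** (bits - 1) - 1)
    return np.round(w / delta) * delta

wq = quantize(w * s)                   # quantize the up-scaled weight
out_div_x = (x / s) @ wq.T             # divide the input by the scales ...
out_div_w = x @ (wq / s).T             # ... or fold the division into the weight
assert np.allclose(out_div_x, out_div_w)
```

Both forms compute sum_j x[:, j] * wq[:, j] / s[j], so they are mathematically identical; the weight-side division just avoids touching the activations during the search.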
Thanks for your reply! Your answer completely solves my problem.
Hello, I still have a question here. Why not quantize first and then divide, like this?
fc.weight.data = w_quantize_func(fc.weight.data) / scales.view(1, -1)
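One way to see why the order matters: quantization is nonlinear, so `Q(W * s) / s` approximates `W` (rounding happens on the scaled grid, which is the whole point of protecting salient channels), while `Q(W) / s` approximates `W / s` and merely rescales already-rounded values. A minimal sketch with a toy round-to-nearest quantizer (my own stand-in for `w_quantize_func`; the scale value is illustrative):

```python
import numpy as np

def quantize(w, bits=4):
    # toy symmetric round-to-nearest quantizer, one step size per output row
    delta = np.abs(w).max(axis=1, keepdims=True) / (2 ** (bits - 1) - 1)
    return np.round(w / delta) * delta

w = np.array([[0.1, 1.0]])      # a small salient weight next to a large one
s = np.array([4.0, 1.0])        # hypothetical scale protecting the first channel

w_awq   = quantize(w * s) / s   # scale, then quantize, then divide (repo order)
w_naive = quantize(w) / s       # quantize first, then divide (proposed order)

err_awq   = np.abs(w_awq - w).max()
err_naive = np.abs(w_naive - w).max()
assert err_awq < err_naive      # scaling before rounding preserves W better
```

In the repo's order, the scaled channel lands on finer effective grid points after the division; in the quantize-first order, the rounding error is fixed before the division, so dividing only shifts the reconstructed weight away from `W`.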