GPTQ-for-LLaMa

Total parameter count is lower after quantization

Open: ZN1010 opened this issue 10 months ago • 1 comment

After quantizing LLaMA2-7b, I notice that the total parameter count of the quantized model is around 1.1B, while the original dense model has around 6.7B parameters. It looks as if the code also prunes the LLM weights. Any idea why weights would be removed in addition to being quantized?
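One thing worth checking, stated as an assumption rather than a confirmed diagnosis: if the quantized layers store several low-bit weights packed into each 32-bit integer, then summing tensor element counts undercounts the logical weights without anything being pruned. The sketch below only illustrates that arithmetic. The helper `packed_numel` and the 4096 x 4096 layer shape are hypothetical and are not taken from the repository.

```python
def packed_numel(in_features, out_features, bits=4, word_bits=32):
    """Element count of an integer tensor that packs `bits`-bit weights
    into `word_bits`-bit words along the input dimension (assumed layout)."""
    per_word = word_bits // bits            # e.g. 8 four-bit weights per int32
    rows = -(-in_features // per_word)      # ceiling division
    return rows * out_features


# Hypothetical linear layer shape, chosen only for illustration.
in_f, out_f = 4096, 4096

logical = in_f * out_f                      # weights the layer represents
stored = packed_numel(in_f, out_f)          # elements a numel()-style count would see
ratio = logical / stored

print(f"logical weights: {logical:,}")
print(f"stored elements: {stored:,}")
print(f"ratio: {ratio:.1f}x")
```

Under this assumption a 4-bit layer reports 8x fewer elements than it represents. Layers left unquantized, plus any per-group scales and zero points, would be counted at full size, so the overall ratio for a whole model would be somewhat smaller than 8x.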

Thanks a lot!

ZN1010, Dec 21 '24 05:12