
Evaluation reports an extremely large value when quantizing to 4-bit

Open · JiachuanDENG opened this issue 1 year ago · 1 comment

I followed the steps to produce a 4-bit version of llama-7b with the command `python -m llama.llama_quant decapoda-research/llama-7b-hf c4 --wbits 4 --groupsize 128 --save pyllama-7B4b.pt`. The script runs without errors, but at the evaluation stage it reports a very large number: 251086.96875.
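For context, the number printed at this stage is perplexity, i.e. the exponential of the average per-token negative log-likelihood. The sketch below (my own illustration, not pyllama's actual evaluation code) shows why 251086.96875 signals a broken model: a healthy llama-7b typically scores single-digit perplexity on C4, while this value corresponds to an average NLL of roughly 12.4 nats, close to random guessing over the vocabulary.

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    return math.exp(sum(nlls) / len(nlls))

# A well-behaved 4-bit llama-7b on C4 lands somewhere in single digits,
# e.g. an average NLL around 1.9 nats:
print(perplexity([1.9, 1.9, 1.9]))   # ~6.7

# The reported 251086.96875 implies an average NLL of about 12.4 nats,
# i.e. the quantized weights are effectively producing noise:
print(math.log(251086.96875))        # ~12.43
```

If the number were merely somewhat higher than the fp16 baseline, that would be expected quantization loss; a five-order-of-magnitude blowup usually means the weights were corrupted during quantization or loaded incorrectly.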

[Screenshot 2023-06-07: evaluation output showing 251086.96875]

And when I test with the quantized .pt file, the model returns unreadable output.

[Screenshot 2023-06-07: sample generation showing garbled output]

Has anyone run into the same problem?

JiachuanDENG · Jun 07 '23