Dmitrii Aksenov

Results: 1 issue by Dmitrii Aksenov

I have quantized the 13B model to 2-bit by executing: `python -m llama.llama_quant decapoda-research/llama-13b-hf c4 --wbits 2 --save pyllama-13B2b.pt`. After the quantization, when I run the test inference, the output...
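
As a minimal sanity check before debugging the inference output, one could inspect the saved checkpoint directly. This is only a sketch, assuming `pyllama-13B2b.pt` (the path from the command above) is a standard PyTorch checkpoint containing a state dict; the key names printed will depend on how `llama.llama_quant` actually serializes the model.

```python
# Sketch: inspect the quantized checkpoint written by the command above.
# Assumes the file is a plain PyTorch checkpoint (torch.load-able) holding tensors.
import torch

state_dict = torch.load("pyllama-13B2b.pt", map_location="cpu")

# Print the first few tensor entries to confirm the quantized weights were saved.
for name, value in list(state_dict.items())[:10]:
    if torch.is_tensor(value):
        print(f"{name}: shape={tuple(value.shape)}, dtype={value.dtype}")
    else:
        print(f"{name}: {type(value).__name__}")
```

If the checkpoint loads and the expected layers appear, the problem is more likely on the inference side than in the saved file.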