Dmitrii Aksenov
I have quantized the 13B model to 2-bit by executing: `python -m llama.llama_quant decapoda-research/llama-13b-hf c4 --wbits 2 --save pyllama-13B2b.pt`. After the quantization, when I run the test inference, the output...
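A minimal sketch for narrowing this down, assuming the `--save` path `pyllama-13B2b.pt` holds a plain PyTorch state dict (an assumption; the exact checkpoint layout depends on the pyllama version and is not the project's documented inference path):

```python
import torch

# Hypothetical sanity check: peek at the checkpoint written by --save to confirm
# the quantized weights were actually stored before blaming the inference step.
state = torch.load("pyllama-13B2b.pt", map_location="cpu")
for name, value in list(state.items())[:5]:
    if torch.is_tensor(value):
        print(name, tuple(value.shape), value.dtype)
    else:
        print(name, type(value).__name__)
```

If the tensors look reasonable, the garbled output is more likely a property of 2-bit quantization itself (a very aggressive setting) than of a broken save/load round trip.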