
LLaMA: Open and Efficient Foundation Language Models

Results: 69 pyllama issues, sorted by recently updated

Could you post the actual text of the command to run inference with quantization? I cannot see the image because I'm blind and use a screen reader. The README says "With quantization, you...

I have quantized the 13B model to 2 bits by executing: `python -m llama.llama_quant decapoda-research/llama-13b-hf c4 --wbits 2 --save pyllama-13B2b.pt`. After quantization, when I run the test inference, the output...
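Garbage output at 2 bits is a common symptom of extreme quantization. As a rough illustration (a minimal sketch, not pyllama's actual GPTQ routine; the weights, function name, and bit widths below are made up for demonstration), uniform symmetric quantization shows how quickly round-trip error grows as the bit width shrinks:

```python
# Illustrative sketch only -- NOT pyllama's GPTQ implementation.
# Rounds each weight to one of 2**wbits evenly spaced levels and
# reports the worst-case round-trip error per bit width.

def quantize_dequantize(weights, wbits):
    """Round each weight to the nearest of 2**wbits evenly spaced levels."""
    levels = 2 ** wbits - 1
    w_max = max(abs(w) for w in weights)
    if w_max == 0:
        return list(weights)
    scale = (2 * w_max) / levels  # step size between adjacent levels
    return [round((w + w_max) / scale) * scale - w_max for w in weights]

weights = [-0.81, -0.33, 0.02, 0.47, 0.95]  # toy weights, not real model data
for bits in (2, 4, 8):
    approx = quantize_dequantize(weights, bits)
    err = max(abs(a - b) for a, b in zip(weights, approx))
    print(f"{bits}-bit max error: {err:.3f}")
```

With only 4 levels at 2 bits, the per-weight error is a large fraction of the weight range, which is why 2-bit models often need GPTQ-style error compensation (and even then may still produce degraded text).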

The pyllama-7B2B.pt download link https://pan.baidu.com/s/1zOdKOHnSCsz6TFix2NTFtg tries to get me to download an executable called BaiduNetdisk_7.26.0.10.exe

I was able to convert the LLaMA weights, quantize them, and run inference using qwopqwop200/GPTQ-for-LLaMa. However, I can't load the result with pyllama. Thanks! 1. Clone: `git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda` 2. Install: `pip...

Is there a way to skip the evaluation step after quantizing? It just takes forever on Colab!

Hi, I'm interested in trying to run LLaMA models. I'm using a MacBook with an AMD GPU, so the easiest option would probably be the CPU. It would be nice to know if it's...

It's just so convenient to be able to auto-format code.

Has anyone got this dockerized for easy install?