exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
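For context, a minimal load-and-generate sketch modeled on the repo's example_basic.py; the model directory below is a placeholder, and settings/class names may differ slightly between commits:
```
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_directory = "models/Llama-2-13B-GPTQ/"  # placeholder path to a GPTQ-quantized model

# Build the config from the model's config.json and point it at the quantized weights
config = ExLlamaConfig(model_directory + "config.json")
config.model_path = model_directory + "model.safetensors"

model = ExLlama(config)                                            # load the quantized model onto the GPU
tokenizer = ExLlamaTokenizer(model_directory + "tokenizer.model")  # SentencePiece tokenizer
cache = ExLlamaCache(model)                                        # key/value cache for generation
generator = ExLlamaGenerator(model, tokenizer, cache)

generator.settings.temperature = 0.7
generator.settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", max_new_tokens=100))
```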
Traceback (most recent call last): File "/home/ubuntu/text-generation-webui/modules/callbacks.py", line 56, in gentask ret = self.mfunc(callback=_callback, *args, **self.kwargs) File "/home/ubuntu/text-generation-webui/modules/text_generation.py", line 361, in generate_with_callback shared.model.generate(**kwargs) File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context return...
I'm sorry, but I am unable to find any relevant documentation on the Internet about how to load all modules on the GPU. I got this error message from my code: ``` Found modules...
I am getting this error when trying to do inference with CodeLLaMA34B from The-Bloke + a LoRA trained on the same model using alpaca_lora_4bit. Commenting out the generator.lora line works....
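For reference, attaching a LoRA adapter in this repo normally goes through `lora.ExLlamaLora`, roughly as in the repo's example_lora.py; the adapter paths below are placeholders, and `model`/`generator` are assumed to be set up as in the sketch above:
```
from lora import ExLlamaLora

# Placeholder paths to a trained adapter (e.g. one produced with alpaca_lora_4bit)
lora_config_path = "loras/my_lora/adapter_config.json"
lora_path = "loras/my_lora/adapter_model.bin"

lora = ExLlamaLora(model, lora_config_path, lora_path)

# This is the line the issue above refers to: commenting it out disables the adapter
generator.lora = lora
```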
Baichuan2 is a new model that has better overall results compared to Llama. `https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat` It works with AutoGPTQ, but I encountered some errors with exllama.
Hey! This is the correct Llama 2 Chat prompt formatting implementation for `example_llama2chat.py`. This PR uses https://github.com/turboderp/exllama/pull/195 to copy the exact implementation of the [original Llama repo](https://github.com/facebookresearch/llama/) The format for...
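For reference, a simplified sketch of the single-turn tag layout as defined in the original Llama repo's generation.py; the full chat format additionally separates turns with BOS/EOS tokens at the tokenizer level, which this sketch does not show:
```
# Tag constants from the original Llama repo (facebookresearch/llama, generation.py)
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    # Single turn: the system prompt is wrapped in <<SYS>> tags inside the first [INST] block
    return f"{B_INST} {B_SYS}{system_prompt}{E_SYS}{user_message} {E_INST}"

print(llama2_chat_prompt("You are a helpful assistant.", "What does ExLlama do?"))
```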
I am getting this on my Mac M1 (Ventura 13.5.2) with Python 3.11.5: ``` Traceback (most recent call last): File "/Users/user/code/project/text-generation-webui/server.py", line 29, in from modules import ( File "/Users/user/code/project/text-generation-webui/modules/ui_default.py",...
I'm writing a production server to handle requests from a large number of rotating clients. I have a custom manager class that handles everything, but I'm hoping to keep the...
Hello! I've been experiencing a problem in exllama (both versions) with this particular model. The model only outputs '\n' when used with exllama. I've come across this problem in two different ways:...
Followed the instructions; the error generated is below: ``` python3 example_chatbot.py -d models/airoboros/model.safetensors -un "Jeff" -p prompt_chatbort.txt Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build subprocess.run( File "/usr/lib/python3.10/subprocess.py", line...
Hello everyone, I'm trying to set up exllama on an Azure ML compute instance and I followed the instructions here https://github.com/turboderp/exllama, but unfortunately I'm getting an error when trying to call this...