exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Results: 99 exllama issues

```
Traceback (most recent call last):
  File "/home/ubuntu/text-generation-webui/modules/callbacks.py", line 56, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/home/ubuntu/text-generation-webui/modules/text_generation.py", line 361, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return...
```

I'm sorry, but I am unable to find relevant documentation on the Internet about how to load all modules onto the GPU. I got this error message from my code:
```
Found modules...
```
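
For reference, here is a minimal sketch of loading every layer onto the GPU(s) with an explicit device split, assuming the `ExLlamaConfig` / `set_auto_map` API used in exllama's example scripts; the model directory and the per-GPU memory split below are placeholders:

```python
import os, glob

# These imports follow exllama's example scripts (e.g. example_basic.py);
# adjust them if the modules are packaged differently in your environment.
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_dir = "/path/to/model"                                  # placeholder
tokenizer_path = os.path.join(model_dir, "tokenizer.model")
config_path = os.path.join(model_dir, "config.json")
model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]

config = ExLlamaConfig(config_path)
config.model_path = model_path
config.set_auto_map("16,24")   # assumed meaning: GB of VRAM to use on GPU 0 and GPU 1

model = ExLlama(config)        # loads all weights onto the mapped GPUs
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

# Generate a short completion to verify the model loaded.
print(generator.generate_simple("Hello, my name is", max_new_tokens=20))
```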

I am getting this error when trying to do inference with CodeLLaMA34B from The-Bloke + a LoRA trained on the same model using alpaca_lora_4bit. Commenting out the `generator.lora` line works....
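
For comparison, a minimal sketch of attaching a LoRA to an exllama generator, following the pattern of the repo's `example_lora.py`; the adapter paths are placeholders, and `model` / `generator` are assumed to already be loaded as in the previous sketch:

```python
# Follows the pattern of exllama's example_lora.py; paths are placeholders.
from lora import ExLlamaLora

lora_config_path = "/path/to/lora/adapter_config.json"   # placeholder
lora_path = "/path/to/lora/adapter_model.bin"            # placeholder

# `model` and `generator` are assumed to be an ExLlama model and an
# ExLlamaGenerator created as shown in the earlier sketch.
lora = ExLlamaLora(model, lora_config_path, lora_path)
generator.lora = lora    # attach the adapter; set to None to disable it again
```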

Baichuan2 is a new model that has better overall results compared to Llama. `https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat` It works with AutoGPTQ, but I encountered an error with exllama. ![3997267f0ea6fc9c7992296e90da7bf5](https://github.com/turboderp/exllama/assets/397630/4a49b9ff-2b03-442c-bd05-da9d376e735e)

Hey! This adds the correct Llama 2 Chat prompt formatting to `example_llama2chat.py`. This PR uses https://github.com/turboderp/exllama/pull/195 to copy the exact implementation from the [original Llama repo](https://github.com/facebookresearch/llama/). The format for...
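
For context, the Llama 2 Chat format wraps each user turn in `[INST] ... [/INST]` and folds the system prompt into the first instruction inside `<<SYS>> ... <</SYS>>`. A rough sketch of building such a prompt (not the PR's actual code) might look like:

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_llama2_prompt(system_prompt, turns):
    """turns: list of (user_message, assistant_reply) pairs; the last reply may be None."""
    # Fold the system prompt into the first user message, as the reference code does.
    first_user, first_reply = turns[0]
    prompt = f"{B_INST} {B_SYS}{system_prompt}{E_SYS}{first_user} {E_INST}"
    if first_reply is not None:
        prompt += f" {first_reply} "
    for user, reply in turns[1:]:
        prompt += f"{B_INST} {user} {E_INST}"
        if reply is not None:
            prompt += f" {reply} "
    return prompt

print(build_llama2_prompt("You are a helpful assistant.",
                          [("Hello!", "Hi there!"), ("What is exllama?", None)]))
```

Note that the BOS/EOS tokens that separate turns are omitted here; the reference implementation handles those at the tokenizer level.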

I am getting this on my Mac M1 (Ventura 13.5.2) with Python 3.11.5:
```
Traceback (most recent call last):
  File "/Users/user/code/project/text-generation-webui/server.py", line 29, in <module>
    from modules import (
  File "/Users/user/code/project/text-generation-webui/modules/ui_default.py",...
```

I'm writing a production server to handle requests from a large number of rotating clients. I have a custom manager class that handles everything, but I'm hoping to keep the...

Hello! I've been experiencing a problem in exllama (both versions) with this particular model. The model only outputs '\n' when used with exllama. I've come across this problem in two different ways:...

I followed the instructions and got the error below:
```
python3 example_chatbot.py -d models/airoboros/model.safetensors -un "Jeff" -p prompt_chatbort.txt
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.10/subprocess.py", line...
```

Hello everyone, I'm trying to set up exllama on an Azure ML compute instance. I followed the instructions here https://github.com/turboderp/exllama, but unfortunately I'm getting an error when trying to call this...