exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
Hi, sorry if this is obvious :) but I'm trying to build the Docker container. It says to "First, set the `MODEL_PATH` and `SESSIONS_PATH` variables in the `.env` file...
This line in `generator.py` yields infinite logits when temperature is set to 0: https://github.com/turboderp/exllama/blob/c16cf49c3f19e887da31d671a713619c8626484e/generator.py#L106C1-L106C30  Debugger result: 
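A minimal sketch of the failure mode and one possible guard; the function and variable names below are illustrative, not the actual `generator.py` code:

```python
import torch

def apply_temperature(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    # Dividing by temperature == 0 produces infinite logits, which then break
    # softmax / sampling downstream. Clamping to a small epsilon (or treating
    # 0 as greedy argmax decoding) avoids the inf values.
    if temperature <= 0.0:
        temperature = 1e-8  # alternatively, fall back to greedy argmax decoding
    return logits / temperature

# temperature = 0 no longer yields inf:
print(apply_temperature(torch.tensor([2.0, -1.0, 0.5]), 0.0))
```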
Hi, I was prompting llama-2-7B and ran into this error. Can you please handle the case where there are NaNs in the logits?
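One hedged sketch of how NaN logits could be handled before sampling; this is not the repository's code, just an assumption of one reasonable guard:

```python
import torch

def sanitize_logits(logits: torch.Tensor) -> torch.Tensor:
    # If the forward pass produces NaNs (e.g. an fp16 overflow), multinomial
    # sampling fails. Mapping NaNs to -inf gives those tokens zero probability
    # instead of crashing the generator.
    if torch.isnan(logits).any():
        logits = torch.nan_to_num(logits, nan=float("-inf"))
    return logits
```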
I am using exllama through the oobabooga text-generation-webui with AMD/ROCm. I cloned exllama into the text-generation-webui/repositories folder and installed dependencies. Devices: 2x AMD Instinct MI60 gfx906 Distro: Ubuntu 20.04.6 Kernel:...
I couldn't reopen my original issue, so I hope it's fine if I open another bug. The Pascal fix is broken again, at least for me. The following check does...
I am getting poor-quality results with prompts longer than 2048 tokens when using a LoRA trained with alpaca_lora_4bit. These are the settings I am using: ``` config = ExLlamaConfig(model_config_path) #...
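For context, a minimal sketch of a config for longer-than-2048-token prompts; the paths are hypothetical and `max_seq_len` / `compress_pos_emb` are assumed config attributes rather than settings confirmed by this issue:

```python
from model import ExLlama, ExLlamaConfig

model_config_path = "/models/llama-7b/config.json"         # hypothetical paths
model_path = "/models/llama-7b/llama-7b-4bit.safetensors"

config = ExLlamaConfig(model_config_path)
config.model_path = model_path
# A LoRA trained on longer sequences still needs the base model's RoPE
# embeddings stretched to match, otherwise quality degrades past 2048 tokens.
config.max_seq_len = 4096
config.compress_pos_emb = 2.0   # linear RoPE scaling factor

model = ExLlama(config)
```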
transformers merged https://github.com/huggingface/transformers/pull/24653, which covers only dynamic NTK RoPE scaling (NTKv2); it would be nice to have it in exllama as well.
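Roughly the scheme from the linked PR, sketched below; parameter names and defaults are illustrative, not exllama's or transformers' exact API:

```python
import torch

def dynamic_ntk_inv_freq(dim: int, seq_len: int, max_position_embeddings: int = 2048,
                         base: float = 10000.0, scaling_factor: float = 2.0) -> torch.Tensor:
    # Once the running sequence length exceeds the trained context, grow the
    # RoPE base so the rotary frequencies cover the longer context; below the
    # trained length the original base is kept unchanged.
    if seq_len > max_position_embeddings:
        base = base * (
            (scaling_factor * seq_len / max_position_embeddings) - (scaling_factor - 1)
        ) ** (dim / (dim - 2))
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
```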
Add a truncation warning, as it can be rough to discover truncation through trial and error or by noticing the context numbers.
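A small sketch of what such a warning could look like; the helper and its parameters are hypothetical, not existing exllama code:

```python
import warnings

def warn_if_truncated(num_prompt_tokens: int, max_new_tokens: int, max_seq_len: int) -> None:
    # Make truncation visible instead of silently dropping the oldest tokens.
    if num_prompt_tokens + max_new_tokens > max_seq_len:
        warnings.warn(
            f"Prompt ({num_prompt_tokens} tokens) plus {max_new_tokens} new tokens "
            f"exceeds max_seq_len={max_seq_len}; older tokens will be dropped."
        )
```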