Badis
> Well, 4-bit by itself is deterministic. 8-bit/fp16 was not, unless you count producing a stream of unending garbage every time as deterministic. Turning off do_sample allows 8-bit to...
> Yes.. but `do_sample = False` generations are repetitive garbage, and you use (NovelAI-Sphinx Moth) in your example. With randomness-enabled generation parameters, you can avoid problems like I...
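To make the comparison concrete, here is a minimal sketch of the two decoding modes with Hugging Face transformers; the model name is a placeholder, not the checkpoint from the original discussion:

```python
# Sketch of greedy vs. sampled decoding with transformers.
# "some/causal-lm" is a placeholder; load whichever checkpoint you're testing.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("some/causal-lm")
model = AutoModelForCausalLM.from_pretrained("some/causal-lm")

inputs = tokenizer("Once upon a time", return_tensors="pt")

# Greedy decoding: deterministic, but tends toward repetitive output.
greedy = model.generate(**inputs, do_sample=False, max_new_tokens=50)

# Sampled decoding: non-deterministic; temperature/top_p add the
# randomness that avoids the repetition problem mentioned above.
sampled = model.generate(
    **inputs, do_sample=True, temperature=0.7, top_p=0.9, max_new_tokens=50
)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```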
I have a 3060, and weirdly it works for me on the CUDA part.
I still have this issue even with the latest Nvidia driver.
I tried it, and it doesn't look like you can talk with the model without an image; you're obligated to give it one to get it going. That's a shame...
No, Qwen has its own tokenizer unfortunately (and somehow they never upload their tokenizer.model, so...). I tried with LLaMA's tokenizer just for the sake of it, and it outputs gibberish.
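A quick sketch of how to see the mismatch; the repo IDs here are illustrative, not necessarily the exact checkpoints discussed:

```python
# Sketch of why swapping tokenizers yields gibberish: the two vocabularies
# map the same text to different token IDs. Repo IDs are illustrative.
from transformers import AutoTokenizer

qwen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
llama_tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

text = "Hello, world!"
qwen_ids = qwen_tok.encode(text)
llama_ids = llama_tok.encode(text)

print(qwen_ids)   # different ID sequences for the same string
print(llama_ids)

# Decoding one tokenizer's IDs with the other is exactly the
# kind of gibberish described above.
print(qwen_tok.decode(llama_ids))
```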
I second this; there should be a way to keep the model in your VRAM until you switch to another model. Edit: Don't use any flags and use those settings,...
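As a hedged sketch of the requested behavior, assuming a hypothetical `load_model` callback (none of these names are the webui's actual API):

```python
# Sketch: keep the current model resident in VRAM, and only free it
# when a different model is selected. All names here are hypothetical.
import torch

_cache = {"name": None, "model": None}

def get_model(name, load_model):
    if _cache["name"] != name:
        if _cache["model"] is not None:
            del _cache["model"]        # drop the old model
            torch.cuda.empty_cache()   # release its VRAM
        _cache["model"] = load_model(name).cuda()
        _cache["name"] = name
    return _cache["model"]
```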
@ggerganov I just tried with t = 14, 12, and 5, and I got the same result each time. Looks like it's fixed! Great work, I appreciate it 😄
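For anyone wanting to reproduce the check, a rough sketch; the binary path, model file, prompt, and seed are placeholders, and the flags follow llama.cpp's usual CLI:

```python
# Sketch: run llama.cpp at several thread counts with a fixed seed and
# confirm the outputs match. Paths, model, and prompt are placeholders.
import subprocess

def run(threads: int) -> str:
    out = subprocess.run(
        ["./main", "-m", "models/7B/ggml-model-q4_0.bin",
         "-p", "Building a website can be done in",
         "-n", "64", "-t", str(threads), "-s", "42"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

results = {t: run(t) for t in (5, 12, 14)}
# If the fix works, every thread count yields the same text.
print(len(set(results.values())) == 1)
```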
I've somehow got a new error during the loading of the 13b LoRA:
```
CUDA SETUP: Loading binary C:\Users\Utilisateur\anaconda3\envs\textgen\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
Adding the LoRA alpaca-lora-13b to the model...
Traceback (most recent...
```
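For reference, this is roughly how a LoRA gets attached with peft; a sketch under the assumption that the loader works this way, not the webui's actual code (paths are placeholders):

```python
# Sketch of attaching a LoRA adapter with peft to an 8-bit base model.
# This is an assumption about what happens internally; paths are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "models/llama-13b", load_in_8bit=True, device_map="auto"
)
# Errors like the one above usually surface at this step, when the
# adapter's shapes or dtype don't match the quantized base model.
model = PeftModel.from_pretrained(base, "loras/alpaca-lora-13b")
```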
I asked GPT-4 if it saw any errors in your code; here's its answer:
```
It's possible that there could be issues with the math operations in the code that...
```