llama.cpp
LLM inference in C/C++
When running various 7B models (Win10, Core i5, GCC 64-bit, 8 GB, 4 threads) with the same program (results are roughly the same across recent revisions), I found the ggml-vicuna-7b-4bit-rev1.bin and ggml-vicuna-7b-4bit.bin much...
fixes https://github.com/ggerganov/llama.cpp/issues/975
This change allows applying LoRA adapters on the fly without having to duplicate the model files. Instructions:
- Obtain the HF PEFT LoRA files `adapter_config.json` and `adapter_model.bin` of a LoRA...
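For context, here is a hedged C++ sketch of what using this from the C API might look like, assuming the `llama_apply_lora_from_file` entry point this change introduces; the exact signature may differ between revisions, and the file paths are illustrative, not canonical:

```cpp
// Hedged sketch: load an unmodified base model, then apply a ggml-converted
// LoRA adapter on the fly (no pre-merged, duplicated model file needed).
#include "llama.h"
#include <cstdio>

int main() {
    llama_context_params params = llama_context_default_params();

    // Load the base model as usual (path is illustrative).
    llama_context * ctx = llama_init_from_file("models/7B/ggml-model-q4_0.bin", params);
    if (ctx == nullptr) {
        std::fprintf(stderr, "failed to load base model\n");
        return 1;
    }

    // Apply the converted LoRA adapter; the third argument is an optional
    // higher-precision base model to read original tensors from, and the
    // last one is the thread count.
    if (llama_apply_lora_from_file(ctx, "lora/ggml-adapter-model.bin", nullptr, 4) != 0) {
        std::fprintf(stderr, "failed to apply LoRA adapter\n");
        return 1;
    }

    // ... run inference as usual ...
    llama_free(ctx);
    return 0;
}
```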
I found this model: [ggml-vicuna-13b-4bit](https://huggingface.co/eachadea/ggml-vicuna-13b-4bit) ([files](https://huggingface.co/eachadea/ggml-vicuna-13b-4bit/tree/main)), and judging by their online demo it's very impressive. I tried to run it with the latest llama.cpp version - the model loads fine, but...
Add code that checks the build host to determine the right CPU features. This is convenient when building the Windows version on a machine without AVX2.
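As a rough illustration of the idea (this is runtime detection with GCC/Clang builtins, not the PR's actual build-script change), a host can be probed for AVX2 support like this:

```cpp
// Minimal sketch: ask the host CPU whether it supports AVX2 before deciding
// which instruction-set flags a build should enable.
#include <cstdio>

int main() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // populate the CPU feature cache used by the builtin below
    if (__builtin_cpu_supports("avx2")) {
        std::printf("host supports AVX2\n");
    } else {
        std::printf("host does not support AVX2 - build without AVX2 flags\n");
    }
#else
    std::printf("this sketch only covers GCC/Clang on x86\n");
#endif
    return 0;
}
```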
As rightly pointed out by @jxy [here](https://github.com/ggerganov/llama.cpp/commit/6232f2d7fd7a22d5eeb62182b2f21fcf01359754#commitcomment-108812025), my changes in #703 limiting the calculation to `int8_t` might overflow. -> Change the types to `int` instead.
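To illustrate the concern, here is a minimal C++ sketch (not the actual ggml code): even a single product of two `int8_t` values can exceed the type's range, so intermediate sums need to be widened to `int`:

```cpp
// Sketch of the overflow issue: accumulating int8_t products in an int8_t
// wraps around, while an int accumulator holds the exact result.
#include <cstdint>
#include <cstdio>

int main() {
    const int8_t a[4] = { 100, 100, 100, 100 };
    const int8_t b[4] = {  50,  50,  50,  50 };

    int8_t narrow = 0;  // too small: 100 * 50 = 5000 already exceeds [-128, 127]
    int    wide   = 0;  // wide enough for the products and their sum

    for (int i = 0; i < 4; ++i) {
        narrow = static_cast<int8_t>(narrow + a[i] * b[i]);  // truncated at every step
        wide  += a[i] * b[i];                                 // exact: 4 * 5000 = 20000
    }

    std::printf("int8_t accumulator: %d, int accumulator: %d\n", narrow, wide);
    return 0;
}
```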
Is it possible to add a param that forces the [end of text] token to be shown? Something like this (I think; I don't understand C/C++): ```cpp if (!embd.empty() && embd.back() == llama_token_eos()) { if...
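For what it's worth, here is a self-contained, purely hypothetical C++ sketch of the requested behaviour; `show_eos`, `gpt_params_sketch`, and the stand-in token value are made up for illustration and are not existing llama.cpp identifiers:

```cpp
// Hypothetical sketch: gate the "[end of text]" marker behind a made-up
// show_eos parameter, roughly what the question is asking for.
#include <cstdio>
#include <vector>

struct gpt_params_sketch {
    bool show_eos = false;  // imagined flag, e.g. set by a command-line option
};

constexpr int kTokenEos = 2;  // stand-in for the real EOS token id

int main() {
    gpt_params_sketch params;
    params.show_eos = true;

    std::vector<int> embd = {15043, 3186, kTokenEos};  // pretend generated tokens

    if (!embd.empty() && embd.back() == kTokenEos) {
        if (params.show_eos) {
            std::fprintf(stderr, " [end of text]\n");  // print the marker only when asked
        }
        // a real generation loop would stop here either way
    }
    return 0;
}
```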
It sometimes just talks to itself, for example: ###Human: Hi ###Assistant: Hello, how can I assist you? (I am running the latest release with Vicuna mode.)
Neither of these links works and the files aren't present anymore. > You have to convert it to the new format using [./convert-gpt4all-to-ggml.py](https://github.com/ggerganov/llama.cpp/blob/master/convert-gpt4all-to-ggml.py). You may also need to convert the...