llama.cpp
LLM inference in C/C++
I'm not sure if this has a place in the repository. I did a bit of prompt engineering to get a conversation going with LLaMA; this is the script I...
Hey! Is it possible to add a way of dumping the current state into a file, so it can then be reloaded later? This would avoid the time needed to...
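A minimal sketch of the idea, assuming the mutable inference state can be treated as a single contiguous buffer (illustrative only; the actual llama.cpp state layout, e.g. the KV cache, is more involved and the buffer here is hypothetical):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical state buffer; in practice this would have to cover the
// KV cache and any other state needed to resume generation.
static std::vector<uint8_t> g_state;

bool dump_state(const char * path) {
    FILE * f = fopen(path, "wb");
    if (!f) return false;
    const size_t n = fwrite(g_state.data(), 1, g_state.size(), f);
    fclose(f);
    return n == g_state.size();
}

bool load_state(const char * path) {
    FILE * f = fopen(path, "rb");
    if (!f) return false;
    fseek(f, 0, SEEK_END);
    const long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    g_state.resize(size);
    const size_t n = fread(g_state.data(), 1, g_state.size(), f);
    fclose(f);
    return n == g_state.size();
}
```

Reloading such a dump would skip re-evaluating the prompt, which is the time saving the request is after.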
What's the supported context window length for each model?
Precompiled files for Windows x64 via CMake.
I'm running on bare metal, nothing emulated. ``` littlemac@littlemac:~$ git clone https://github.com/ggerganov/llama.cpp Cloning into 'llama.cpp'... remote: Enumerating objects: 283, done. remote: Counting objects: 100% (283/283), done. remote: Compressing objects: 100%...
When I compile with make, the following error occurs: ``` inlining failed in call to ‘always_inline’ ‘_mm256_cvtph_ps’: target specific option mismatch 52 | _mm256_cvtph_ps (__m128i __A) ``` Error will be...
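This error means the file is being compiled without F16C support: `_mm256_cvtph_ps` is an F16C intrinsic, so GCC refuses to inline it unless that extension is enabled for the target (e.g. via `-mf16c` or `-march=native`, and only on a CPU that actually has F16C). A minimal reproduction of the requirement:

```cpp
// f16c_test.cpp -- builds with: g++ -mavx -mf16c f16c_test.cpp
// Without -mf16c, GCC fails with the same "target specific option
// mismatch" error, because the always_inline intrinsic cannot be
// emitted for a target that lacks F16C.
#include <immintrin.h>
#include <cstdio>

int main() {
    __m128i half = _mm_set1_epi16(0x3C00);  // 1.0 in IEEE half precision
    __m256  full = _mm256_cvtph_ps(half);   // requires F16C
    float out[8];
    _mm256_storeu_ps(out, full);
    printf("%f\n", out[0]);
    return 0;
}
```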
Includes vectorised inference code, quantisation and a counterpart to the Q4_0 multipart fix we introduced a while ago. Tested working up to 13B, though I can't confidently say anything about...
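For reference, a simplified sketch of Q4_0-style block quantisation as I understand it: weights are grouped into blocks of 32, and each block stores one fp32 scale plus 32 offset-binary 4-bit values packed two per byte. Exact rounding and edge-case handling in ggml may differ; this only illustrates the scheme.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

constexpr int QK = 32; // block size

struct block_q4_0 {
    float   d;          // per-block scale
    uint8_t qs[QK / 2]; // 32 quantized values, 4 bits each
};

// Quantize QK floats into one block: find the max magnitude, derive a
// scale so values map into [-7, 7], then store each value shifted by 8
// so that 0.0f quantizes to 8.
void quantize_block_q4_0(const float * x, block_q4_0 * y) {
    float amax = 0.0f;
    for (int i = 0; i < QK; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }
    const float d  = amax / 7.0f;
    const float id = d ? 1.0f / d : 0.0f;
    y->d = d;
    for (int i = 0; i < QK; i += 2) {
        const uint8_t v0 = (uint8_t)(std::roundf(x[i + 0] * id) + 8);
        const uint8_t v1 = (uint8_t)(std::roundf(x[i + 1] * id) + 8);
        y->qs[i / 2] = v0 | (v1 << 4);
    }
}
```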
Use RMSNorm
The original paper and the reference implementation [1] use RMSNorm. However, llama.cpp uses ggml_norm(), which looks like LayerNorm? The differences between these may not be too obvious, because...
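For comparison, a minimal sketch of the two normalisations with the affine/bias terms omitted: LayerNorm subtracts the mean and divides by the standard deviation, while RMSNorm only divides by the root mean square and never centers the input. On activations whose mean is close to zero the two behave similarly, which is why the difference is easy to miss.

```cpp
#include <cmath>
#include <cstddef>

// LayerNorm-style: y_i = (x_i - mean) / sqrt(var + eps)
void layer_norm(const float * x, float * y, size_t n, float eps = 1e-5f) {
    float mean = 0.0f;
    for (size_t i = 0; i < n; ++i) mean += x[i];
    mean /= n;
    float var = 0.0f;
    for (size_t i = 0; i < n; ++i) var += (x[i] - mean) * (x[i] - mean);
    var /= n;
    const float scale = 1.0f / std::sqrt(var + eps);
    for (size_t i = 0; i < n; ++i) y[i] = (x[i] - mean) * scale;
}

// RMSNorm-style (what the LLaMA reference uses): no mean subtraction,
// y_i = x_i / sqrt(mean(x^2) + eps)
void rms_norm(const float * x, float * y, size_t n, float eps = 1e-5f) {
    float ss = 0.0f;
    for (size_t i = 0; i < n; ++i) ss += x[i] * x[i];
    const float scale = 1.0f / std::sqrt(ss / n + eps);
    for (size_t i = 0; i < n; ++i) y[i] = x[i] * scale;
}
```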
Add disk space requirements from https://cocktailpeanut.github.io/dalai/#/?id=_7b, as suggested in #195.
I was attempting to merge alpaca-lora from https://huggingface.co/tloen/alpaca-lora-7b with the original LLaMA-7B from https://huggingface.co/decapoda-research/llama-7b-hf; I also tried to quantize the model and run the main binary in llama.cpp. The merge code is...
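Arithmetically, merging a LoRA adapter into a base checkpoint amounts to W' = W + (alpha/r) * B A per adapted weight matrix. A hedged sketch with hypothetical plain-array matrices (the real merge scripts operate on PyTorch tensors per attention projection; this only illustrates the update):

```cpp
#include <vector>

// Merge a LoRA adapter into a base weight matrix:
//   W' = W + (alpha / r) * B * A
// W is (rows x cols), B is (rows x r), A is (r x cols), all row-major.
// All names here are illustrative, not llama.cpp or PEFT APIs.
void merge_lora(std::vector<float> & W,
                const std::vector<float> & B,
                const std::vector<float> & A,
                int rows, int cols, int r, float alpha) {
    const float scale = alpha / r;
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            float delta = 0.0f;
            for (int k = 0; k < r; ++k) {
                delta += B[i * r + k] * A[k * cols + j];
            }
            W[i * cols + j] += scale * delta;
        }
    }
}
```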