
LLM inference in C/C++

1628 llama.cpp issues, sorted by recently updated

@prusnak `./quantize "$i" "${i/f16/q4_0}" 2 &`
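That one-liner relies on Bash pattern substitution to derive the output filename. A minimal sketch of the full batch loop it implies — assuming f16 GGML files laid out as `models/<size>/ggml-model-f16.bin`, and `./quantize` built from this repo (the trailing `2` is the q4_0 quantization type in early versions; check your build):

```shell
# Quantize every f16 model in the tree to q4_0, in parallel.
for i in models/*/ggml-model-f16.bin; do
  # ${i/f16/q4_0} replaces the first "f16" in the path with "q4_0",
  # e.g. models/7B/ggml-model-f16.bin -> models/7B/ggml-model-q4_0.bin
  ./quantize "$i" "${i/f16/q4_0}" 2 &   # run each job in the background
done
wait  # block until all background quantize jobs have finished
```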

enhancement

llama.cpp seems to give bad results compared to Facebook's implementation. Here's an example of a simple reading comprehension prompt: > Question: "Tom, Mark, and Paul bought books: two with pictures and one...

model
generation quality

We should probably make a logo for this project. Like an image of a 🦙 and some C++

good first issue
🦙.

When I run the two commands, the installer throws the following errors about halfway through the install: cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o ggml.c:1364:25:...

Is there any setting in any of the scripts to change the context limit? :) Thanks in advance!

Hello, I wanted to experiment with installing the system in a Linux/Debian container, but I am getting the following error when I issue make: - "failed in call to 'always_inline' '_mm256_cvtph_ps'"...

Use cmake to create the VC++ project, and debug in VS2022. python convert-pth-to-ggml.py models/7B/ 1 done. quantize.exe .\models\7B\ggml-model-f16.bin .\models\7B\ggml-model-q4_0.bin 2 done. llama -m .\models\7B\ggml-model-q4_0.bin -t 8 -n 128 > main:...

bug

@ggerganov , can we expect an android port like the whisper one?

build

Fix the CMake build on Linux to prevent it from failing with an error message:

```
/usr/bin/ld: libggml.a(ggml.c.o): in function `ggml_graph_compute':
ggml.c:(.text+0x16960): undefined reference to `pthread_create'
/usr/bin/ld: ggml.c:(.text+0x169c3): undefined reference...
```
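Those `undefined reference to 'pthread_create'` errors are the classic symptom of compiling against pthreads without linking the threads library. A hedged sketch of the usual CMake-side fix, assuming a library target named `ggml` as in the repo's CMakeLists.txt (treat the target name as illustrative):

```cmake
# Ask CMake for the platform threads library (adds -pthread on Linux)
find_package(Threads REQUIRED)
# Link it into the ggml library so pthread_create resolves at link time
target_link_libraries(ggml PRIVATE Threads::Threads)
```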

We have come up with several changes to CMakeLists.txt that are expected to improve performance, compatibility, and maintainability, and we have drafted them. Change list: 1. remove _NO_ from...

enhancement
build