llama.cpp
LLM inference in C/C++
llama.cpp seems to give bad results compared to Facebook's implementation. Here's an example of a simple reading-comprehension prompt: > Question: "Tom, Mark, and Paul bought books: two with pictures and one...
We should probably make a logo for this project. Like an image of a 🦙 and some C++
When I run the two commands, the installer throws the following errors about halfway through the install: cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE -c ggml.c -o ggml.o ggml.c:1364:25:...
Is there any setting in any of the scripts to change the context limit? :) Thanks in advance!
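For what it's worth, in recent builds the context size is a command-line flag on the main binary rather than a script setting; a sketch of the invocation (the flag name `--ctx_size` and the model path are assumptions from memory and may differ by version):

```shell
# Hypothetical invocation: raise the prompt-context window from the
# default (512 at the time of writing) to 2048 tokens.
./main -m ./models/7B/ggml-model-q4_0.bin --ctx_size 2048 -n 128 -p "Hello"
# Short form of the same flag: -c 2048
```

Note that models trained with a fixed context length may degrade if you run them past it, regardless of what the flag allows.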
Hello, I wanted to experiment with installing the system in a Linux/Debian container, but I get the following error when I issue make: - "failed in call to 'always_inline' '_mm256_cvtph_ps'"...
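This "always_inline" failure on `_mm256_cvtph_ps` usually means the F16C intrinsics are being compiled without the matching instruction-set flag. A possible workaround, assuming GCC or Clang on an x86-64 machine whose CPU actually supports F16C (the exact CFLAGS here are a sketch, not the project's canonical ones):

```shell
# Hypothetical workaround: enable F16C (and AVX, which it depends on) explicitly.
make clean
make CFLAGS="-I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mf16c"

# Or let the compiler pick up everything the build machine supports:
# make CFLAGS="... -march=native"
```

If the CPU genuinely lacks F16C, the flags will make the binary crash at runtime instead; in that case the AVX paths need to be disabled rather than force-enabled.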
Use cmake to create the VC++ project, and debug in VS2022. python convert-pth-to-ggml.py models/7B/ 1 done. quantize.exe .\models\7B\ggml-model-f16.bin .\models\7B\ggml-model-q4_0.bin 2 done. llama -m .\models\7B\ggml-model-q4_0.bin -t 8 -n 128 > main:...
Fix the CMake build on Linux to prevent it from failing with an error message. ``` /usr/bin/ld: libggml.a(ggml.c.o): in function `ggml_graph_compute': ggml.c:(.text+0x16960): undefined reference to `pthread_create' /usr/bin/ld: ggml.c:(.text+0x169c3): undefined reference...
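The undefined `pthread_create` references at link time suggest the CMake build is not passing `-pthread`. One plausible fix (a sketch; the target name `ggml` is an assumption and must match whatever the actual CMakeLists.txt defines) is to link the standard CMake Threads package instead of relying on an implicit `-lpthread`:

```cmake
# Prefer the -pthread compiler flag over bare -lpthread where available.
set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package(Threads REQUIRED)

# Link the imported target so both compile and link flags propagate.
# "ggml" is a placeholder; use the real library target from CMakeLists.txt.
target_link_libraries(ggml PUBLIC Threads::Threads)
```

Using the imported `Threads::Threads` target is generally more portable than hardcoding `-lpthread`, since CMake resolves the right flag per platform and compiler.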
We have drafted several changes to CMakeLists.txt that are expected to improve performance, compatibility, and maintainability. Change list: 1. remove _NO_ from...