llama.cpp

LLM inference in C/C++

Results 1628 llama.cpp issues

Bug encountered when running `python3 convert-pth-to-ggml.py models/7B/ 1`:

```
llama.cpp % python3 convert-pth-to-ggml.py models/7B/ 1
Traceback (most recent call last):
  File "/Users/jjyuhub/llama.cpp/convert-pth-to-ggml.py", line 69, in <module>
    hparams = json.load(f)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/json/__init__.py",...
```

When I build, the Makefile detects my M1 Max as x86_64. This is because I have GNU coreutils `uname` on my `PATH`, which announces my architecture as `arm64` (whereas the...

bug
hardware
build

I propose refactoring `main.cpp` into a library (`llama.cpp`, compiled to `llama.so`/`llama.a`/whatever) and making `main.cpp` a simple driver program. A simple C API should be exposed to access the model, and...

Apologies if GitHub Issues is not the right place for this question, but do you know if anyone has hosted the ggml versions of the models? The disk space required...

Hi, I'm getting strange behaviour and answers:

```
./main -m ./models/7B/ggml-model-q4_0.bin -t 8 -n 256 --repeat_penalty 1.0 --color -p "User: how many wheels have a car?"
main: seed =...
```

By deleting line 155 (`#include `) in ggml.c, it builds and works just fine on RISC-V. Maybe this could be handled in CMake?

enhancement
hardware

Without the "static" qualifier, it fails to compile with clang:

```
ld.lld: error: undefined symbol: packNibbles
>>> referenced by ggml.c:520 (llama_cpp/ggml.c:520)
>>> .../llama_cpp/__ggml__/__objects__/ggml.c.pic.o:(quantize_row_q4_0)
ld.lld: error: undefined symbol: bytesFromNibbles
>>> referenced by...
```