llama.cpp
LLM inference in C/C++
@ggerganov Thanks for sharing llama.cpp. As usual, great work. Question rather than issue. How difficult would it be to make ggml.c work for a Flan checkpoint, like T5-xl/UL2, then quantized?...
Hi, I just tested on a RISC-V board: 4xC910 2.0G TH1520 LicheePi4A (https://sipeed.com/licheepi4a) with 16GB LPDDR4X. It runs at about 6s/token without any instruction acceleration, and it should be '' 15597 -> 'They' sampling parameters:...
I haven't found a consistent pattern to reproduce this, but sometimes the model will continue outputting text even after it has printed the reverse prompt. If colors are enabled, they...
### Someone please take over this pull request? Unfortunately, I'm quite behind on a few other obligations so I won't be able to continue exploring here. Feel free to take...
Should fix #289
On Win 10 ``` > docker run -v /llama/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B Downloading model... Traceback (most recent call last): File "/app/./download-pth.py", line 3, in <module> from tqdm import tqdm ModuleNotFoundError:...
Commit c9f670a (Implement non-greedy tokenizer that tries to maximize token lengths) breaks llama?
Old version: ``` .\build\Release\llama.exe -m C:\...\models\30B\ggml-model-q4_0.bin -t 10 -n 256 --seed 100 --temp 0.2 -p "list all US states in alphabetical order:" output: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut,...
``` (base) dave@macbook-pro llama.cpp % make I llama.cpp build info: I UNAME_S: Darwin I UNAME_P: arm I UNAME_M: arm64 I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE I CXXFLAGS:...
Can someone please confirm the following md5 sums are correct? I regenerated them with the latest code. ``` $ md5sum ./models/*/*.pth | sort -k 2,2 0804c42ca65584f50234a86d71e6916a ./models/13B/consolidated.00.pth 016017be6040da87604f77703b92f2bc ./models/13B/consolidated.01.pth f856e9d99c30855d6ead4d00cc3a5573...
In interactive mode: ``` Bob: Sure. The largest city in Europe is Moscow, the capital of Russia. User: xxx ``` Press CTRL+C, the program exits, but terminal color still remains...