llama.cpp

LLM inference in C/C++

Results: 1628 llama.cpp issues

With these changes llama3.2 model could be converted to big endian.

python
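Big-endian conversion of a model amounts to re-encoding each tensor's bytes in the opposite byte order while leaving the numeric values unchanged. A minimal stdlib-only sketch of that idea (illustrative only, not the actual conversion code; names here are hypothetical):

```python
import struct

# One float32 weight encoded little-endian, the typical layout on x86 hosts.
value = 3.14
le_bytes = struct.pack("<f", value)

# Re-encode the same value big-endian: identical number, reversed byte order.
be_bytes = struct.pack(">f", value)

# For a 4-byte scalar, the big-endian encoding is the little-endian one reversed.
assert be_bytes == le_bytes[::-1]

# Decoding each encoding with its own byte order recovers the same value.
assert struct.unpack("<f", le_bytes)[0] == struct.unpack(">f", be_bytes)[0]
```

The real converter does this at the tensor level across the whole GGUF file, so a model produced on a little-endian host can be loaded on big-endian hardware.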

### Name and Version > .\llama-server.exe --version ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no ggml_cuda_init: found 2 CUDA devices: Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes Device...

bug

I'm using llama.cpp to deploy deepseek-r1-671B-Q4_0 weights, but I found the documentation/README.md is barely detailed; I even have to read the source to understand what would happen if I make some...

### Name and Version ./llama-server --version version: 4607 (aa6fb132) built with Apple clang version 15.0.0 (clang-1500.1.0.2.5) for arm64-apple-darwin23.4.0 ### Operating systems Mac ### Which llama.cpp modules do you know to...

bug-unconfirmed

### Name and Version on latest commit ce8784bdb153ff7794dde5a50b0ebfa51baa6171 but have been noticing it for several days now ### Operating systems _No response_ ### Which llama.cpp modules do you know to...

enhancement
good first issue

This PR adds a bulletpoint to the contributing guidelines stating that PRs should not contain multiple, unrelated features.

### Git commit 4418 ### Operating systems BSD ### GGML backends CPU ### Problem description & steps to reproduce [This -D_XOPEN_SOURCE=600 argument](https://github.com/ggerganov/llama.cpp/blob/master/Makefile#L286) breaks compilation: ``` In file included from /usr/ports/misc/llama-cpp/work/llama.cpp-b4418/ggml/src/ggml-vulkan/ggml-vulkan.cpp:8:...

bug-unconfirmed

### Name and Version ./build/bin/llama-cli --version version: 4731 (0f2bbe65) built with cc (conda-forge gcc 12.2.0-19) 12.2.0 for aarch64-conda-linux-gnu ### Operating systems Linux ### GGML backends CPU ### Hardware ### NPU(8...

bug-unconfirmed

### Name and Version ./llama-cli --version ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no ggml_cuda_init: found 1 CUDA devices: Device 0: NVIDIA GeForce RTX 2080 Ti, compute capability 7.5, VMM: yes version:...

bug-unconfirmed

Remove an unused header file that causes a compilation failure on ARM platforms with GCC 13.

ggml
Ascend NPU