
LLM inference in C/C++

Results: 2083 llama.cpp issues, sorted by recently updated

Refactored git/file handling into specialized classes to support arbitrary input files. For now it falls back to JSONL if sqlite fails, but there is also support for (multiple)...

script
python

### Feature Description I do not know if it is possible, but it may be nice to support a draft model in more places. If I am not wrong, only the...

enhancement

### Name and Version ``` $ bin/llama-mtmd-cli --version version: 5343 (62d4250e) built with cc (GCC) 15.1.1 20250425 (Red Hat 15.1.1-1) for x86_64-redhat-linux ``` ### Operating systems Linux ### GGML backends...

bug-unconfirmed

Possibly supersedes #16813. This PR adds support for running concurrent CUDA streams on single-GPU setups. At the moment this only targets the Q, K, V branch. I feel this...

Nvidia GPU
ggml

### Git commit Marks-MacBook-Air:llama.cpp mjgs$ git rev-parse HEAD aa3ee0eb0b80efca126cedf9bcb4fb5864b46ce3 ### Operating systems Mac ### GGML backends Vulkan ### Problem description & steps to reproduce I'm trying to compile llama.cpp with...

bug-unconfirmed
stale

### Name and Version ilintar@LinuksowaJaskinia:/mnt/win/k/models/unsloth/granite-4.0-h-small-GGUF$ llama-cli --version load_backend: loaded BLAS backend from /devel/tools/llama.cpp/build/bin/libggml-blas.so register_backend: registered backend BLAS (1 devices) register_device: registered device BLAS (OpenBLAS) ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no...

bug-unconfirmed
stale

### Name and Version Why would anyone implement the syntax-error checking of the escaped JSON inside the LLM response in a way that does not work? What was the...

bug-unconfirmed
stale

### Name and Version vulkan: version: 6719 (aa4711d3) built with cc (GCC) 15.2.1 20250813 for x86_64-pc-linux-gnu sycl: version: 6719 (aa4711d3) built with Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205) for x86_64-unknown-linux-gnu...

bug-unconfirmed
stale

This PR builds on and supersedes https://github.com/ggml-org/llama.cpp/pull/15826 from @ngxson. * Adds `__EMSCRIPTEN__` preprocessor conditionals where necessary for compilation (this included some OS-specific things in `common/`) * Adds flags...

build
script
testing
devops
ggml

A missing `break` causes the SSM_CONV unit-test case to fail.

ggml
SYCL