
Misc. bug: since b4800 llama-cli does not prompt and llama-bench shows no results


Name and Version

Last working version:

$ llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | matrix cores: none
version: 4799 (14dec0c2)
built with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line


Problem description & steps to reproduce

Starting with b4800, llama-cli does not reach the prompt input; it stops here:

$ llama-cli -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
[...]
main: interactive mode on.
sampler seed: 507615108
sampler params: 
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
        top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist 
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.
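
For what it's worth, a quick way to check whether the stall is tied to interactive input handling would be a non-interactive run (a minimal sketch; -p and -n are standard llama-cli flags, and the prompt text and token count here are just illustrative):

$ llama-cli -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37 -p "Hello" -n 16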

Similarly, llama-bench shows no results (and no error):

$ llama-bench -m llama-2-7b.Q4_0.gguf
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | matrix cores: none
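
One way to tell whether the regression is specific to the Vulkan backend would be a CPU-only rebuild (a sketch, assuming the standard CMake build; without -DGGML_VULKAN=ON the Vulkan backend is not compiled in, and build-cpu is just a hypothetical build directory name):

$ cmake -B build-cpu
$ cmake --build build-cpu --config Release -j
$ ./build-cpu/bin/llama-bench -m llama-2-7b.Q4_0.gguf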

First Bad Commit

https://github.com/ggml-org/llama.cpp/commit/cc473cac7cea1484c1f870231073b0bf0352c6f9
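
For reference, a first bad commit like this is typically pinned down with git bisect between the last good and first bad release tags (a sketch, assuming a Vulkan-enabled CMake build; b4799 and b4800 are the llama.cpp release tags mentioned above):

$ git bisect start
$ git bisect good b4799
$ git bisect bad b4800
$ cmake -B build -DGGML_VULKAN=ON && cmake --build build -j
$ ./build/bin/llama-bench -m llama-2-7b.Q4_0.gguf   # then mark: git bisect good|bad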

Relevant log output


pabpas · May 11, 2025