Misc. bug: llama-server does not print model loading errors by default (log level misconfigured?)
Name and Version
build: 4205 (c6bc7395) with Apple clang version 16.0.0 (clang-1600.0.26.3) for arm64-apple-darwin23.6.0
Operating systems
Mac
Which llama.cpp modules do you know to be affected?
llama-server
Problem description & steps to reproduce
When loading a model with, e.g., an incompatible pre-tokenizer, the loading error isn't shown by default; llama-server simply exits.
$ ./llama-server -m model-CoT-Q4_K_M.gguf
build: 4205 (c6bc7395) with Apple clang version 16.0.0 (clang-1600.0.26.3) for arm64-apple-darwin23.6.0
[...]
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 11
main: loading model
srv load_model: loading model 'model-CoT-Q4_K_M.gguf'
llama_load_model_from_file: using device Metal (Apple M2 Max) - 49151 MiB free
llama_model_loader: loaded meta data with 25 key-value pairs and 579 tensors
[...]
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q4_K: 289 tensors
llama_model_loader: - type q6_K: 49 tensors
$
(The exit code is 1.)
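For anyone scripting around this, the exit status is the only failure signal at the default log level. A minimal sketch (using `false` as a self-contained stand-in for the failing llama-server invocation, since there is no error line to grep for):

```shell
# 'false' stands in for the silent llama-server failure (exits non-zero,
# prints nothing) -- this is illustrative, not the actual server command.
false
status=$?

if [ "$status" -ne 0 ]; then
    # With default logging there is no error message, so the exit
    # status is the only indication that the model failed to load.
    echo "model load failed (exit $status); re-run with -v for details"
fi
```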
Adding -v seems to raise the verbosity enough for the error to be shown:
$ ./llama-server -v -m model-CoT-Q4_K_M.gguf
[...]
llama_model_loader: - type q4_K: 289 tensors
llama_model_loader: - type q6_K: 49 tensors
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_load_model_from_file: failed to load model
common_init_from_params: failed to load model 'model-CoT-Q4_K_M.gguf'
srv load_model: failed to load model, 'model-CoT-Q4_K_M.gguf'
main: exiting due to model loading error
I think errors should be shown by default.