llama.cpp
Eval bug: unknown pre-tokenizer type: 'deepseek-r1-qwen'
Name and Version
$./llama-cli --version version: 3680 (947538ac) built with cc (Debian 14.2.0-16) 14.2.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CPU
Hardware
Intel Celeron 1007U
Models
bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF
Problem description & steps to reproduce
I would like to load DeepSeek-R1-Distill-Qwen-1.5B on my old PC, but llama.cpp reports an error that the pre-tokenizer type is unknown.
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model
main: error: unable to load model
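This error usually means the llama.cpp binary predates support for the model's pre-tokenizer: build 3680 appears to be from before the `deepseek-r1-qwen` pre-tokenizer was added to the vocabulary loader. Assuming that is the cause, a likely fix is updating the checkout and rebuilding (CPU-only CMake build sketched below; paths and flags follow the standard llama.cpp build instructions):

```shell
# Assumption: build 3680 predates 'deepseek-r1-qwen' pre-tokenizer support,
# so updating to a current build should resolve the vocabulary error.
git -C llama.cpp pull                 # update an existing clone
cd llama.cpp
cmake -B build                        # CPU-only configure (no backend flags needed)
cmake --build build --config Release -j
./build/bin/llama-cli --version       # should report a much newer build number
```

If rebuilding is not an option, re-converting the model with an older `convert_hf_to_gguf.py` will not help, since the pre-tokenizer name is taken from the model itself; the runtime must know the type.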
First Bad Commit
No response
Relevant log output
Log start
main: build = 3680 (947538ac)
main: built with cc (Debian 14.2.0-16) 14.2.0 for x86_64-linux-gnu
main: seed = 1740215762
llama_model_loader: loaded meta data with 39 key-value pairs and 339 tensors from example.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Qwen 1.5B
llama_model_loader: - kv 3: general.finetune str = Abliterated
llama_model_loader: - kv 4: general.basename str = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv 5: general.size_label str = 1.5B
llama_model_loader: - kv 6: general.base_model.count u32 = 1
llama_model_loader: - kv 7: general.base_model.0.name str = DeepSeek R1 Distill Qwen 1.5B
llama_model_loader: - kv 8: general.base_model.0.organization str = Deepseek Ai
llama_model_loader: - kv 9: general.base_model.0.repo_url str = https://huggingface.co/deepseek-ai/De...
llama_model_loader: - kv 10: general.tags arr[str,1] = ["Distill"]
llama_model_loader: - kv 11: qwen2.block_count u32 = 28
llama_model_loader: - kv 12: qwen2.context_length u32 = 131072
llama_model_loader: - kv 13: qwen2.embedding_length u32 = 1536
llama_model_loader: - kv 14: qwen2.feed_forward_length u32 = 8960
llama_model_loader: - kv 15: qwen2.attention.head_count u32 = 12
llama_model_loader: - kv 16: qwen2.attention.head_count_kv u32 = 2
llama_model_loader: - kv 17: qwen2.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 18: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 19: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 20: tokenizer.ggml.pre str = deepseek-r1-qwen
llama_model_loader: - kv 21: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 22: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 23: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 24: tokenizer.ggml.bos_token_id u32 = 151646
llama_model_loader: - kv 25: tokenizer.ggml.eos_token_id u32 = 151643
llama_model_loader: - kv 26: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 27: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 28: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 29: tokenizer.chat_template str = {% if not add_generation_prompt is de...
llama_model_loader: - kv 30: general.quantization_version u32 = 2
llama_model_loader: - kv 31: general.file_type u32 = 14
llama_model_loader: - kv 32: general.url str = https://huggingface.co/bartowski/Deep...
llama_model_loader: - kv 33: mradermacher.quantize_version str = 2
llama_model_loader: - kv 34: mradermacher.quantized_by str = mradermacher
llama_model_loader: - kv 35: mradermacher.quantized_at str = 2025-01-24T22:47:13+01:00
llama_model_loader: - kv 36: mradermacher.quantized_on str = rain
llama_model_loader: - kv 37: general.source.url str = https://huggingface.co/bartowski/Deep...
llama_model_loader: - kv 38: mradermacher.convert_type str = hf
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type q4_K: 190 tensors
llama_model_loader: - type q5_K: 7 tensors
llama_model_loader: - type q6_K: 1 tensors
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model
main: error: unable to load model