whisper.cpp
talk-llama : add check for deepseek-r1-qwen in llama-vocab.cpp
talk-llama: Add a check for deepseek-r1-qwen in llama-vocab.cpp so that models such as unsloth/DeepSeek-R1-Distill-Qwen-32B from Hugging Face can be run. A full sync of llama.cpp would be preferable if that could be automated somehow.
Solves the following unknown pre-tokenizer error when running with DeepSeek-R1-Distill-Qwen-32B:

llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_model_load_from_file: failed to load model
No llama.cpp model specified. Please provide using -ml <modelfile>
@ggerganov Could talk-llama be moved into llama.cpp? Syncing whisper.cpp into llama.cpp looks simpler and would be needed less frequently.
It would simplify the sync, yes, but we would need to introduce SDL2 support to the llama.cpp examples, and currently it would be used for just this single example, while in whisper.cpp more examples use SDL2. So I am not very confident that it would be worth it.