whisper.cpp
talk-llama : add check for deepseek-r1-qwen in llama-vocab.cpp
talk-llama: Add a check for deepseek-r1-qwen in llama-vocab.cpp so that models such as unsloth/DeepSeek-R1-Distill-Qwen-32B from Hugging Face can be run. A full sync of llama.cpp would be preferable if that could be automated somehow.
Solves the following unknown pre-tokenizer error when running with DeepSeek-R1-Distill-Qwen-32B:

llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_model_load_from_file: failed to load model
No llama.cpp model specified. Please provide using -ml <modelfile>
@ggerganov Could talk-llama be moved into llama.cpp? Syncing whisper.cpp into llama.cpp looks simpler and would be needed less frequently.
It would simplify the sync, yes, but we would need to introduce SDL2 support to the llama.cpp examples, and currently it would be used for just this single example, while in whisper.cpp more examples use SDL2. So I am not very confident that it would be worth it.