whisper.cpp
How to ggml-ify other fine-tuned whisper models?
Hi, I would love to use this model https://huggingface.co/pere/whisper-NST2.
I tried pointing convert-pt-to-ggml.py at the pytorch_model.bin file, but I get an error:
Traceback (most recent call last):
File "convert-pt-to-ggml.py", line 209, in
Can someone point me in the right direction? thanks :)
Follow the instructions here: https://github.com/ggerganov/whisper.cpp/tree/master/models#fine-tuned-models
Assuming you have the whisper and whisper.cpp directories side by side and you are in the whisper.cpp directory (where models/convert-h5-to-ggml.py lives):
git clone https://huggingface.co/pere/whisper-NST2
python3 models/convert-h5-to-ggml.py whisper-NST2 ../whisper custom
(You already have the .bin file, so you can just put it in the whisper-NST2 directory.)
The last parameter (custom) is just the name of the directory where I keep my custom models. After a minute or so, you will have a file named custom/ggml-model.bin, and you can run:
./main -f input.wav -m custom/ggml-model.bin -l your_language
And that's it.
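For anyone who wants to script the conversion step, here is a minimal Python sketch of the command above. It assumes you run it from the directory that contains models/convert-h5-to-ggml.py and that the argument order is model directory, whisper repo path, output directory, as in the command in this reply. The helper names build_convert_command and expected_output are made up for illustration and are not part of whisper.cpp.

```python
import sys
from pathlib import Path


def build_convert_command(model_dir, whisper_repo, out_dir,
                          script="models/convert-h5-to-ggml.py"):
    """Return the argv list for the conversion command used in this thread.

    Hypothetical helper: the argument order (model dir, whisper repo,
    output dir) is copied from the command above, not from the script source.
    """
    return [sys.executable, script, str(model_dir), str(whisper_repo), str(out_dir)]


def expected_output(out_dir):
    """Path of the file the reply above says the conversion produces."""
    return Path(out_dir) / "ggml-model.bin"


if __name__ == "__main__":
    cmd = build_convert_command("whisper-NST2", "../whisper", "custom")
    print(" ".join(cmd))
    # To actually run the conversion, uncomment the next two lines:
    # import subprocess
    # subprocess.run(cmd, check=True)
    print("expected output:", expected_output("custom"))
```

Building the command as a list rather than a single string avoids shell quoting problems if a model directory name contains spaces.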
Perfect, that worked. Thanks for the swift reply :D
For some reason my fine-tuning run (following that tutorial) doesn't seem to create a vocab.json file in the checkpoint folders. Is there a step that generates that file that I could have missed? The conversion script expects it.
Also, I'm assuming that the checkpoint directories are the final product of training. Perhaps I am missing a finalization step somewhere?
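A quick way to see what a checkpoint folder is missing before running the conversion is a small pre-flight check like the sketch below. The list of required files is an assumption: vocab.json comes from this thread, while config.json and added_tokens.json are my guess at what the conversion script also reads, so adjust the tuple to match your copy of the script. The function name missing_files is hypothetical.

```python
from pathlib import Path

# Assumed list of files the conversion script reads from the model directory.
# Only vocab.json is confirmed by this thread; the others are assumptions.
ASSUMED_REQUIRED = ("config.json", "vocab.json", "added_tokens.json")


def missing_files(model_dir, required=ASSUMED_REQUIRED):
    """Return the names in `required` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # Usage: python check_checkpoint.py path/to/checkpoint-dir
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_files(target)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all assumed files present")
```

If the tokenizer files turn out to be absent, one possible fix (again an assumption, not something confirmed in this thread) is to save the tokenizer used during training into the checkpoint folder with the Hugging Face save_pretrained method, then rerun the check.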