whisper.cpp
Fine-tuned Whisper models are very slow
I tried to run fine-tuned Whisper medium models.
- The standard OpenAI models are fast (inference takes about half the audio duration with 16 threads and 1 process).
- Even the standard Hugging Face Whisper models are fast (inference takes about half the audio duration with 16 threads and 1 process).
But the fine-tuned Whisper models are not fast at all:
- They take more than double the audio duration to transcribe, and this gets worse as the file length increases.
- Scaling the number of threads and processes brings no benefit.
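To make the timings above comparable across files, they can be expressed as a real-time factor (processing time divided by audio duration). This is only a small sketch; the function name `real_time_factor` and the 60-second example file are illustrative assumptions, not something from whisper.cpp.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean faster than real time; values above 1.0 mean slower.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Illustrative numbers (assumed): a 60-second file.
standard_model = real_time_factor(30.0, 60.0)     # "half the audio duration"
fine_tuned_model = real_time_factor(120.0, 60.0)  # "double the audio duration"
print(standard_model, fine_tuned_model)
```

With these assumed numbers, the standard models sit at a factor of 0.5, while the fine-tuned models are at 2.0 or above.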
I would really like to know whether anyone else is facing the same issue, and how to solve it. @ggerganov, my observations so far:
- After fine-tuning, the Hugging Face models usually stop predicting timestamps as well. They just hard-segment at 30 seconds. Maybe that has something to do with it.
- Or the way the cpp counterparts of the fine-tuned models are created may have an effect, although that didn't affect the non-fine-tuned Hugging Face Whisper models.
The fallback implementation is currently suboptimal, and I think this is causing the slow performance.
Try using --no-fallback for now. In the future we will try to improve the performance of fallbacks.
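For readers unfamiliar with the fallback mechanism, here is a minimal Python sketch of the idea, modeled on the temperature fallback loop in OpenAI's reference Whisper implementation. It is not the whisper.cpp code: the names `DecodeResult`, `decode_with_fallback`, and the `decode_once` callback are hypothetical, and the thresholds are the reference defaults, which may differ from whisper.cpp's own checks. The point is only that a segment failing the quality checks gets decoded again at a higher temperature, so one segment can cost several decoding passes.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class DecodeResult:
    text: str
    avg_logprob: float
    temperature: float


def compression_ratio(text: str) -> float:
    """High values indicate repetitive output (a common failure mode)."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def decode_with_fallback(
    decode_once: Callable[[float], DecodeResult],  # hypothetical single-pass decoder
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    compression_ratio_threshold: float = 2.4,
    logprob_threshold: float = -1.0,
) -> Tuple[DecodeResult, int]:
    """Retry at increasing temperatures until the result passes both checks.

    Returns the last result and the number of decoding passes that were run.
    """
    passes = 0
    result = None
    for temperature in temperatures:
        result = decode_once(temperature)
        passes += 1
        too_repetitive = compression_ratio(result.text) > compression_ratio_threshold
        too_unlikely = result.avg_logprob < logprob_threshold
        if not (too_repetitive or too_unlikely):
            break  # good enough, no further fallback needed
    return result, passes


# Stub decoder (assumption): always returns a low-confidence result,
# so every temperature in the schedule is tried.
def low_confidence_decoder(temperature: float) -> DecodeResult:
    return DecodeResult("hello world", avg_logprob=-2.0, temperature=temperature)


_, passes_with_fallback = decode_with_fallback(low_confidence_decoder)
_, passes_without_fallback = decode_with_fallback(low_confidence_decoder, temperatures=(0.0,))
print(passes_with_fallback, passes_without_fallback)
```

In this sketch, disabling fallback corresponds to passing a single temperature, so each segment is decoded exactly once regardless of the checks. A model whose output keeps failing the checks would pay for the whole temperature schedule on every segment.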
@ggerganov awesome!!! This worked. For me, it transcribes a 30-second segment in 17-18 seconds, which is relatively fast! With fallback, the time varies a lot and often exceeds 2 minutes.
All your work is mind-blowing! Thank you for existing! Huge inspiration, keep pushing the boundaries!
Perhaps this is a stupid question, but how do I implement --no-fallback with a fine-tuned model using the Hugging Face pipeline? My fine-tuned version of whisper-medium.en takes 8 seconds for a 3-second clip.
You can just provide it as an argument to whisper.cpp:
./main -m ../ggml-small-model.bin -l si -bs 0 -d 7000 --no-fallback -debug -bo 1 -pp -ps -t 16 -f samples/test.wav