distil-whisper
Long-Form transcription with Faster Whisper
Hi, I have been working with faster-whisper and trying to use the distil-whisper model. However, distil-whisper operates on 30-second audio chunks, and when I use it with faster-whisper the output only covers the first 30 seconds.
How can it be used with the faster-whisper implementation?
Hey @9throok - cool to see that you're using Distil-Whisper in combination with Faster-Whisper! I believe the .transcribe method in Faster-Whisper handles the long-form generation algorithm: https://github.com/guillaumekln/faster-whisper#usage
Is this the API that you've been using? If you could share a reproducible code snippet that showcases the behaviour you're seeing, that would be great, thanks!
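For reference, here is a minimal sketch of the kind of snippet being asked about, assuming the usual faster-whisper usage pattern (WhisperModel plus .transcribe). The helper names, the "large-v2" model name, the device settings and the audio path are placeholders of mine, not something taken from this thread.

```python
def load_model(model_name="large-v2"):
    """Load a faster-whisper model (model name and settings are placeholders)."""
    # Imported lazily so the helper below can be exercised without the package.
    from faster_whisper import WhisperModel

    return WhisperModel(model_name, device="cpu", compute_type="int8")


def transcribe_long_form(model, audio_path):
    """Return (start, end, text) for every segment of a long audio file."""
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only happens while the generator is being iterated.
    segments, info = model.transcribe(audio_path, beam_size=5)
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

Calling transcribe_long_form(load_model(), "audio.wav") and printing the tuples should show whether the segments really stop after 30 seconds. Note that nothing is transcribed until the segments generator is consumed.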
@9throok, any update on the issue that you mentioned?
I had the same issue: after the first chunk there was nothing in the output. I then looked at the debug output, and the distil model just hallucinated non-stop after the first chunk. The solution is to disable the context prompt; an initial prompt has a negative effect too.
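A rough sketch of that workaround, assuming "context prompt" corresponds to the condition_on_previous_text option of faster-whisper's .transcribe and that the initial prompt is its initial_prompt argument. That mapping is my reading of the comment above, and the helper name is made up.

```python
def transcribe_without_context(model, audio_path):
    """Transcribe with previous-text conditioning and the initial prompt disabled."""
    segments, info = model.transcribe(
        audio_path,
        # Assumption: this is the "context prompt" mentioned above. When False,
        # the text of the previous window is not fed back in as a prompt.
        condition_on_previous_text=False,
        # No user-supplied prompt for the first window either.
        initial_prompt=None,
    )
    # Segments are produced lazily; joining them runs the actual decoding.
    return "".join(seg.text for seg in segments).strip()
```

Here model is expected to be a faster_whisper.WhisperModel instance. If the output still stops after the first chunk with these settings, the cause is probably something else.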
Now it has official support -> https://github.com/SYSTRAN/faster-whisper/commit/ad3c83045bc0748b744e064ddfda680c86662e7e
Or you can use the standalone executable -> https://github.com/Purfview/whisper-standalone-win