whisper.cpp

Error: "whisper_full: failed to generate timestamp token - this should not happen"

Open Topping1 opened this issue 2 years ago • 4 comments

I was transcribing a German-language YouTube video with the command line ./main -m ggml-base.bin bauer.wav -t 8 -l de -osrt, and the process ran fine until around the 4-minute mark. Then I got the error:

"whisper_full: failed to generate timestamp token - this should not happen"

repeated several times, and the transcription never resumed. I changed the command line to use 4 cores and dropped the SRT file generation, but still got the same error. Curiously, if I force English transcription with "-l en", the transcription is fine until 4 minutes or so, and then the same sentence repeats until the end of the file.

I think this happened after the commit to reduce the sentence length.

Topping1, Oct 19 '22 03:10

Just in case you want to replicate the issue, the audio is from https://youtu.be/ZLkYpSMkgS4.

Topping1, Oct 19 '22 03:10

The sampling strategy currently is not perfect - I have also seen it fail in the described way. I will improve it in the future.

One thing you can try in your case is to retry transcription using the --offset argument to re-start the transcription right before the failure. For example, re-start from the 4-minute mark:

./main -m models/ggml-base.bin -f bauer.wav -t 8 -l de -o 240000

This might help - not sure though. Another option is to try the bigger models - for example small.
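The value 240000 passed for the 4-minute mark suggests that the offset is given in milliseconds. A minimal sketch for computing such a value from a position in the audio follows; the helper name offset_ms is made up for illustration and is not part of whisper.cpp:

```python
def offset_ms(minutes: int, seconds: float = 0.0) -> int:
    """Convert a minutes/seconds position into whole milliseconds.

    Hypothetical helper: it assumes the offset argument expects
    milliseconds, as the 240000 used for the 4-minute mark suggests.
    """
    return int(round((minutes * 60 + seconds) * 1000))


# The 4-minute mark used in the command above.
print(offset_ms(4))
```

Starting a few seconds before the point of failure, for example offset_ms(3, 55), may be a reasonable choice so that the restart does not begin mid-sentence.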

Also double-check that you have the latest master version of the repo. Do a make clean + make just in case.

On my MacBook I am able to completely transcribe the audio without issues: youtube-test0.srt.txt

ggerganov, Oct 19 '22 05:10

Thanks for the suggestion, the offset argument did the trick. I think a nice feature would be an argument to specify the start number for the subtitles, because when using the offset argument the numbering starts at 1 again. That way one can merge 2 or more SRT files from different runs with minimal edits. Thanks again for your efforts.
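Until such an argument exists, the renumbering can also be done after the fact. The following is only a rough sketch: the function names parse_srt and merge_srt are made up, and it assumes that the timestamps written by the offset run are already relative to the start of the audio, so that only the indices need fixing. Any repeated or broken entries at the end of the first file would still have to be trimmed by hand.

```python
import re


def parse_srt(text: str) -> list[tuple[str, str]]:
    """Split SRT text into (timing_line, subtitle_text) pairs, dropping the old indices."""
    entries = []
    # SRT entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue  # skip empty or malformed blocks
        timing, body = lines[1], "\n".join(lines[2:])
        entries.append((timing, body))
    return entries


def merge_srt(*parts: str) -> str:
    """Concatenate several SRT texts and renumber the entries consecutively from 1."""
    merged = []
    index = 1
    for part in parts:
        for timing, body in parse_srt(part):
            merged.append(f"{index}\n{timing}\n{body}\n")
            index += 1
    return "\n".join(merged)


# Made-up sample data: a first run, and a second run restarted at the 4-minute mark.
first = "1\n00:00:00,000 --> 00:00:02,000\nHallo\n\n2\n00:00:02,000 --> 00:00:04,000\nWelt\n"
second = "1\n00:04:00,000 --> 00:04:03,000\nWeiter\n"
print(merge_srt(first, second))
```

In the merged output, the entry from the second run carries index 3 instead of 1, while its timing line is left untouched.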

Topping1, Oct 20 '22 01:10

The option is now available. For example, -on 24 will offset the index by 24.

Also, I changed the -o argument to -ot or --offset-t

ggerganov, Oct 21 '22 15:10