whisper.cpp

App is getting into endless loop

Open MosheMaorKaltura opened this issue 1 year ago • 5 comments

I am running whisper.cpp inside Docker. As a POC, I translated 800 WAV files.

In a few cases (fewer than 5), the client gets into an endless loop at a certain point in the audio. If I tell it to start a second after the loop point, it transcribes the audio as expected. A few notes:

  1. The audio is in English.
  2. It happens across all models.
  3. Using the Python version, translate works fine.
  4. It is consistent: it always happens at the same point in time.

This is the output that I get (I changed some of the text for privacy reasons):

.....
[00:00:30.000 --> 00:00:32.000] Some valid text bla bla
[00:00:32.000 --> 00:00:35.000] Some valid text bla bla
[00:00:35.000 --> 00:00:53.000] Some valid text bla bla
[00:00:53.000 --> 00:01:09.000] Some valid text bla bla
[00:01:09.000 --> 00:01:30.000] We were working at high school.
[00:01:30.000 --> 00:01:45.000] We were working at high school.
[00:01:45.000 --> 00:02:06.000] We were working at high school.
[00:02:06.000 --> 00:02:21.000] We were working at high school.
[00:02:21.000 --> 00:02:40.000] We were working at high school.
...
It goes on like that up to the end.
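When processing many files in a batch, loops like the one in the output above can be flagged automatically. The following is only a minimal sketch, assuming the transcript is available as text in the "[start --> end] text" format shown here. The function name `find_repetition_loops` and the `min_repeats` cutoff are hypothetical and not part of whisper.cpp.

```python
import re

# Matches lines such as "[00:01:09.000 --> 00:01:30.000] Some text".
SEGMENT_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)"
)


def find_repetition_loops(transcript, min_repeats=3):
    """Return (start_timestamp, text, count) for each run of identical
    consecutive segments that repeats at least `min_repeats` times.

    `min_repeats` is an arbitrary illustrative cutoff, not a whisper.cpp setting.
    """
    runs = []  # each entry: [start_timestamp, text, count]
    for line in transcript.splitlines():
        match = SEGMENT_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not transcript segments
        start, _end, text = match.groups()
        text = text.strip()
        if runs and runs[-1][1] == text:
            runs[-1][2] += 1  # same text as the previous segment: extend the run
        else:
            runs.append([start, text, 1])  # new text: start a new run
    return [tuple(run) for run in runs if run[2] >= min_repeats]
```

The first element of each result is the timestamp where the repeated text begins, which is roughly the point to restart from or inspect. Genuinely repeated speech would also be flagged, so the result is a hint rather than proof of a decoder loop.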

MosheMaorKaltura avatar Feb 16 '23 14:02 MosheMaorKaltura

Yeah, I've had the same thing happen to me with Chinese audio files. It only happens at a very few positions in the audio, but when it does happen, it goes on until the end of the audio, and it also seems to happen consistently when I restart the transcription process.

LaurenzV avatar Feb 16 '23 22:02 LaurenzV

I also have the same problem when I have to transcribe in Portuguese. Is there any solution for that?

guardiaopt avatar Feb 18 '23 15:02 guardiaopt

Have you checked this: https://github.com/ggerganov/whisper.cpp/discussions/408#discussioncomment-4780401

geimist avatar Feb 18 '23 16:02 geimist

With the parameter "--max-context 0", it no longer loops until the end of the file. It still has some loops, but they are relatively short. Thank you!
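For batch runs, the same parameter can be passed from a script. This is a rough sketch, assuming the whisper.cpp example binary is built as "./main" and accepts "-m" for the model and "-f" for the input file. The binary path, the file paths, and the helper names are placeholders for illustration only.

```python
import subprocess


def build_whisper_command(wav_path, model_path, binary="./main", max_context=0):
    """Build the argument list for a whisper.cpp CLI run.

    `binary` is an assumed path to the compiled example program; adjust it
    to wherever the executable lives in your Docker image.
    """
    return [
        binary,
        "-m", model_path,                   # model file
        "-f", wav_path,                     # input WAV file
        "--max-context", str(max_context),  # 0 = carry no previous text into the next window
    ]


def transcribe(wav_path, model_path):
    """Run the CLI and return its standard output (needs a built binary)."""
    result = subprocess.run(
        build_whisper_command(wav_path, model_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to log or inspect the exact arguments before running 800 files through it.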

guardiaopt avatar Feb 18 '23 18:02 guardiaopt

This behavior occurs when the entropy-based repetition detection fails. It can sometimes be mitigated by adjusting the entropy threshold, as explained here:

https://github.com/ggerganov/whisper.cpp/issues/471#issuecomment-1416947068
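To illustrate the general idea behind an entropy-based check, here is a simplified sketch. It is not the actual whisper.cpp implementation; the window size of 32 and the threshold of 2.4 are assumed values chosen for the example. A decoded sequence that keeps repeating a few tokens has low entropy over its recent tokens, so it falls below the threshold and can be rejected.

```python
import math
from collections import Counter


def token_entropy(tokens, window=32):
    """Shannon entropy (in nats) of token frequencies in the last `window` tokens.

    `window` is an illustrative value, not taken from the whisper.cpp source.
    """
    recent = tokens[-window:]
    if not recent:
        return 0.0
    counts = Counter(recent)
    total = len(recent)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def looks_repetitive(tokens, entropy_threshold=2.4, window=32):
    """Flag a sequence whose recent tokens have entropy below the threshold.

    `entropy_threshold` is an assumed example value; tune it for your data.
    """
    return token_entropy(tokens, window) < entropy_threshold
```

In this sketch, raising the threshold makes the check stricter, so more sequences are flagged as repetitive. A loop made of a long, varied sentence can still have fairly high entropy within the window, which is one way a check of this kind can miss a repetition.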

A more robust strategy needs to be implemented.

Alternatively, you can try using the beam-search decoder, but it will make processing slower.

ggerganov avatar Feb 19 '23 06:02 ggerganov