endless loop repetition

Open eagleftw023 opened this issue 1 year ago • 21 comments

Hi, I get repeated/hallucinated lines when transcribing: at some point it stops transcribing and just lists the same words over and over. How can I solve this, please?

eagleftw023 avatar Apr 14 '23 10:04 eagleftw023
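
The symptom described above (the same line repeated over and over) can be spotted automatically in a finished transcript. The following is only a minimal sketch, not part of Whisper: the function name `find_repetition_runs` and the threshold of 3 repeats are assumptions chosen for illustration, and the input is assumed to be the plain text lines of a transcript.

```python
def find_repetition_runs(lines, min_repeats=3):
    """Return (start_index, text, count) for each run of identical consecutive lines.

    min_repeats is a hypothetical threshold; tune it for your material.
    Blank lines are never reported as a run.
    """
    runs = []
    # Compare case-insensitively and ignore surrounding whitespace.
    normalized = [line.strip().lower() for line in lines]
    start = 0
    for i in range(1, len(normalized) + 1):
        # A run ends at the end of the input or when the text changes.
        if i == len(normalized) or normalized[i] != normalized[start]:
            count = i - start
            if count >= min_repeats and normalized[start]:
                runs.append((start, lines[start].strip(), count))
            start = i
    return runs
```

A non-empty result suggests that the transcript fell into a loop somewhere, which may be a reason to rerun the file with a smaller context.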

-mc 0 helps but it's a larger issue..

KjeldsenDK avatar Apr 14 '23 13:04 KjeldsenDK

Where can I do this, and how?

eagleftw023 avatar Apr 14 '23 13:04 eagleftw023

You can specify it using the command-line version. Also, this issue has been reported very often; the first occurrence in this repository is from me: https://github.com/Const-me/Whisper/issues/26

emcodem avatar Apr 14 '23 14:04 emcodem

I use the desktop version. Do you mean that I should download the source and open it in Visual Studio?

eagleftw023 avatar Apr 14 '23 14:04 eagleftw023

Sorry for the short explanation. No, I did not want you to compile the source code or anything like that. Just download and use the command-line version. The download is called cli.zip and contains main.exe. Call it on the command line like this (of course you need to replace all the paths in this example with the paths where your files are, or create the same paths that I use here and place the required files there):

C:\whisper\main.exe -mc 0 -f C:\temp\test.wav -l de -m C:\whisper\models\ggml-large.bin

We can also put this in a batch file and drag and drop the files to transcribe onto the .bat file. Create a new empty text file on your desktop (or anywhere else) with the following content:

"C:\whisper\main.exe" -m "C:\whisper\models\ggml-large.bin" -l de -mc 0 -osrt %1

pause

Change the paths to main.exe and the model as needed (but keep the quotes). Also change -l de to your language code, and change -mc 0 to -mc 224 if you want to raise transcription quality at the (current) risk of falling into a repetition loop. You can change -osrt to -ovtt or -otxt; this is the option that enables writing an output text file. Save the file with a .bat extension and drag and drop an audio file onto it. A cmd window should open while the transcription is running; you can close it to cancel, or once it has finished.

A further improvement could be to add a simple registry key that adds a right-click menu entry on .mp3 and .wav files to start the batch file and transcribe. If anyone is interested, please ask.

emcodem avatar Apr 14 '23 15:04 emcodem
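
For anyone who prefers a script over a batch file, the same command line can be assembled programmatically. This is a minimal sketch that assumes Python is installed; the helper names `build_command` and `transcribe_all` are made up for illustration, and only flags mentioned in this thread are used (-m, -l, -mc, -f and one of -osrt / -ovtt / -otxt). The paths are the example paths from the post above and must be adapted.

```python
import subprocess


def build_command(exe, model, audio, language="de", max_context=0, output_flag="-osrt"):
    """Return the argument list for one transcription run of main.exe.

    max_context=0 corresponds to the "-mc 0" workaround from this thread.
    """
    return [
        str(exe),
        "-m", str(model),
        "-l", language,
        "-mc", str(max_context),
        output_flag,
        "-f", str(audio),
    ]


def transcribe_all(exe, model, audio_files, **options):
    """Run main.exe once per audio file and return the exit code for each file."""
    results = {}
    for audio in audio_files:
        cmd = build_command(exe, model, audio, **options)
        # Passing a list avoids quoting problems with spaces in paths.
        results[str(audio)] = subprocess.run(cmd).returncode
    return results
```

For example, `transcribe_all(r"C:\whisper\main.exe", r"C:\whisper\models\ggml-large.bin", [r"C:\temp\test.wav"], language="de")` would run the same command as the batch file, once per listed file.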

Fashionably late to repeat the above :-) Yes, -mc 0 is a command-line option. The GUI has limited options, and to be honest, I spent hours searching for a detailed explanation of all the command-line options without success. The -mc 0 tip came from another thread on the subject.

KjeldsenDK avatar Apr 14 '23 15:04 KjeldsenDK

Thanks a lot, this helped and it now works with no loops. But sometimes I get the message "runFullImpl: failed to generate timestamp token - skipping one second". Is there any solution to this?

eagleftw023 avatar Apr 14 '23 15:04 eagleftw023

@Const-me any chance we can get --max-context exposed in the GUI without having to use the command line? At least until the repetitions issue gets fixed by OpenAI and it trickles down.

albino1 avatar Apr 14 '23 22:04 albino1

I downloaded this repository as a ZIP file, but I couldn't find main.exe in it. Where can I find it?

aszk1415 avatar Apr 15 '23 01:04 aszk1415

The compiled binaries and the source code are two different things :) The CLI program main.exe is in cli.zip, which you can download from the same place as all the other compiled files: https://github.com/Const-me/Whisper/releases/

emcodem avatar Apr 15 '23 01:04 emcodem

@emcodem Hi, do you know how to solve this? "runFullImpl: failed to generate timestamp token - skipping one second"

eagleftw023 avatar Apr 15 '23 01:04 eagleftw023

@eagleftw023 No, I mentioned the same thing in https://github.com/Const-me/Whisper/issues/26. Because this message appears at the same spot where the repetition used to start for me, I believed it might be the actual root cause of all the trouble. However, the general line in this repository seems to be that we wait for the people from the original Whisper project (or from the whisper.cpp project) to come up with a solution, so that it can be copied into this repository.

emcodem avatar Apr 15 '23 09:04 emcodem

whisper.cpp pushed an update today, v1.3.0, that might help with the repetitions:

https://github.com/ggerganov/whisper.cpp/releases/tag/v1.3.0

The key commit is this one:

https://github.com/ggerganov/whisper.cpp/commit/f19e23fbd108ec3ac458c7a19b31c930719e7a94

Hopefully we'll see this quickly ported over to this project.

albino1 avatar Apr 15 '23 16:04 albino1

@Const-me

eagleftw023 avatar Apr 16 '23 12:04 eagleftw023

I just tested it with the latest ggerganov master; the problem is not solved. I don't understand why he thinks the problem can be solved just by reducing the number of parallel decoders, but he surely has his reasons.

emcodem avatar Apr 25 '23 16:04 emcodem

Users of the CLI version can try the same workaround that I use; there is a download here: https://github.com/Const-me/Whisper/issues/26

emcodem avatar Apr 27 '23 16:04 emcodem

Ok, newbie here: I used the command line and it seemed to improve it a bit, however: where is my txt file saved? :D

tondeaf avatar May 09 '23 01:05 tondeaf

@tondeaf Check the help by executing main.exe without any parameters. For example, adding -osrt to the command-line parameters will write an .srt file next to the input file.

emcodem avatar May 10 '23 20:05 emcodem

Thank you! (can you set a name for the output text file or will it just always be the input file +txt?)

tondeaf avatar May 11 '23 08:05 tondeaf

Saw this posted today on whisper.cpp if anybody wants to give it a try:

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

https://github.com/EtienneAb3d/WhisperHallu

Related OpenAI:

https://github.com/openai/whisper/discussions/679

albino1 avatar May 12 '23 17:05 albino1

Well, preprocessing is a topic in general, but I fear it is not a general-purpose solution for the looping problem, as it will usually make some situations better and others worse. Since there are countless ways of preprocessing and none of them will be perfect for all use cases, I feel that core projects like whisper.cpp and the Const-me version should not attempt preprocessing at all. Instead, they should provide interfaces that make it easy to connect preprocessors to the core software.

@tondeaf I don't think main.exe has a way yet to specify the .srt output filename/location.

emcodem avatar May 12 '23 18:05 emcodem
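
Until the output name can be set directly, one possible workaround is to move the transcript after main.exe has finished. This is a minimal sketch under assumptions: the helper name `move_transcript` is hypothetical, and because the exact output naming is not confirmed in this thread, it checks both "input name with the extension replaced" and "input name plus the extension".

```python
import shutil
from pathlib import Path


def move_transcript(audio_path, target_path, extension=".srt"):
    """Move the transcript written next to audio_path to target_path.

    Both naming schemes below are guesses; the first file that exists is used.
    """
    audio = Path(audio_path)
    candidates = [
        audio.with_suffix(extension),     # e.g. test.srt
        Path(str(audio) + extension),     # e.g. test.wav.srt
    ]
    for candidate in candidates:
        if candidate.is_file():
            target = Path(target_path)
            # Create the destination folder if it does not exist yet.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(candidate), str(target))
            return target
    raise FileNotFoundError(f"no {extension} transcript found next to {audio}")
```

Calling it right after the transcription command, for example from the same script that starts main.exe, keeps the two steps together.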