
Faster Whisper large-v2 model is repeating the segments

Open Talhazeb opened this issue 2 years ago • 17 comments

Hi, I am currently using the faster-whisper large-v2 model with German audio, and it repeats the same text in a loop. I have not been able to find the cause in faster-whisper, but the same file with openai/whisper does not produce the repeated segments. Here is my code for the transcription using faster-whisper large-v2:

from faster_whisper import WhisperModel
import time

model_whisper = WhisperModel("large-v2", device="cuda", compute_type="float32", device_index=[0])

segments, info = model_whisper.transcribe("../../test-audio/dog.MP3", beam_size=5, language='de', vad_filter=True)
for segment in segments:
    print(segment.text)

The output repeats the same segments:

Guten Tag und herzlich Willkcmen bei Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Ich rufe aus "private data". Ich rufe aus "private data". der Deutschen t&dien. Ich interessiere mich für den Glasfaserausbau. Ich interessiere mich für den Glasfaserausbau. Ich würde gerne einen Beratungstermin vereinbaren. Ich würde gerne einen Beratungstermin vereinbaren. Gibt mir eine ganz kurze Postleitzahl. Gibt mir eine ganz kurze Postleitzahl. Beratung können Sie nur online beantragen. Beratung können Sie nur online beantragen.

Talhazeb avatar Aug 14 '23 10:08 Talhazeb

Could you share the audio sample to reproduce the issue?

Purfview avatar Aug 14 '23 17:08 Purfview

@Purfview Sure, I can send it to you by email. Could you kindly share your email address here?

Talhazeb avatar Aug 14 '23 17:08 Talhazeb

purfview [@] protonmail [.] com

Purfview avatar Aug 14 '23 17:08 Purfview

sent

Talhazeb avatar Aug 14 '23 18:08 Talhazeb

I didn't get any repeats. The settings I used: --device=cpu --language=de --model=large-v2 --compute_type=float32 --beam_size=5 --vad_filter=False

Make sure you are using the latest 0.7.1 version.

Purfview avatar Aug 14 '23 19:08 Purfview

@Talhazeb Did you try with the latest version?

guillaumekln avatar Aug 28 '23 09:08 guillaumekln

@guillaumekln Yes, with the latest version (0.7.1).

Talhazeb avatar Aug 28 '23 09:08 Talhazeb

Do you get repeats with the settings I used? The only differences from yours were device=cpu and vad_filter=False.

Purfview avatar Aug 28 '23 14:08 Purfview

+1 here. The large-v1 model works; large-v2 and medium do not.

zyokia avatar Aug 29 '23 05:08 zyokia

Please share the input audio file if possible.

guillaumekln avatar Aug 29 '23 08:08 guillaumekln

@Purfview I need the vad_filter, since disabling it creates problems with other audio files. @guillaumekln Could you kindly share your email address so I can send you the file? Thanks.

Talhazeb avatar Aug 29 '23 08:08 Talhazeb

guillaume [.] klein [@] systrangroup [.] com

guillaumekln avatar Aug 29 '23 09:08 guillaumekln

@guillaumekln sent

Talhazeb avatar Aug 29 '23 09:08 Talhazeb

The VAD filter is creating the problem here. You can try making the filter more conservative, for example by increasing the minimum silence duration from 2 seconds to 3 seconds:

model.transcribe(..., vad_filter=True, vad_parameters=dict(min_silence_duration_ms=3000))

Note that openai/whisper does not apply a separate VAD filter.
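For reference, the call above could be wrapped roughly as follows. This is only a sketch: `transcribe_with_vad` and `main` are hypothetical helper names, and the model name, language, and audio path are simply taken from the original report. The `faster_whisper` import is kept inside `main` so the helper can be tried with any object that exposes a compatible `transcribe` method.

```python
from typing import Any, List


def transcribe_with_vad(model: Any, audio_path: str, min_silence_ms: int = 3000) -> List[str]:
    """Transcribe with the VAD filter on and a custom minimum silence duration.

    `model` is expected to behave like faster_whisper.WhisperModel.
    """
    segments, _info = model.transcribe(
        audio_path,
        beam_size=5,
        language="de",
        vad_filter=True,
        vad_parameters=dict(min_silence_duration_ms=min_silence_ms),
    )
    # `segments` is a lazy generator; iterating over it runs the transcription.
    return [segment.text for segment in segments]


def main() -> None:
    # Imported here so the helper above does not require the package at import time.
    from faster_whisper import WhisperModel

    # Settings copied from the original report; adjust to your setup.
    model = WhisperModel("large-v2", device="cuda", compute_type="float32")
    for text in transcribe_with_vad(model, "../../test-audio/dog.MP3"):
        print(text)
```

Calling `main()` runs the transcription with a 3-second minimum silence; pass a different `min_silence_ms` to `transcribe_with_vad` to try other values.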

guillaumekln avatar Aug 29 '23 10:08 guillaumekln

@guillaumekln Thanks a lot for checking it out and letting me know. I will check and let you know.

Talhazeb avatar Aug 29 '23 16:08 Talhazeb

I have likely run into the same issue. Adjusting the min_silence_duration_ms parameter only moves the repeated segments to other places in the audio. It would be very time-consuming to test each audio file repeatedly to find a value that prevents the repeats. I would like to automate this part as well.
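One way such a search could be automated is sketched below. This is a heuristic of my own, not something faster-whisper provides: `count_adjacent_repeats` and `pick_min_silence` are hypothetical helpers, and `transcribe_fn` is assumed to be a function you supply that transcribes one file with a given min_silence_duration_ms and returns the segment texts. It only detects exact adjacent duplicates, so partial repeats like some of those in the original report would be missed.

```python
from typing import Callable, Iterable, Optional, Tuple


def count_adjacent_repeats(texts: Iterable[str]) -> int:
    """Count segments whose normalized text equals the previous segment's text."""
    repeats = 0
    previous: Optional[str] = None
    for text in texts:
        # Normalize case and whitespace so trivial differences do not hide a repeat.
        current = " ".join(text.lower().split())
        if current and current == previous:
            repeats += 1
        previous = current
    return repeats


def pick_min_silence(
    transcribe_fn: Callable[[int], Iterable[str]],
    candidates_ms: Iterable[int],
) -> Tuple[int, int]:
    """Return (min_silence_ms, repeats) for the candidate with the fewest repeats.

    Ties are resolved in favor of the earliest candidate.
    """
    best: Optional[Tuple[int, int]] = None
    for ms in candidates_ms:
        repeats = count_adjacent_repeats(transcribe_fn(ms))
        if best is None or repeats < best[1]:
            best = (ms, repeats)
    if best is None:
        raise ValueError("candidates_ms must not be empty")
    return best
```

Each candidate costs one full transcription of the file, so a short candidate list (for example 2000, 3000, 4000) keeps the sweep affordable. A low repeat count does not guarantee a correct transcript, so the chosen value is still worth spot-checking.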

makoto-toyouke avatar Aug 30 '23 10:08 makoto-toyouke

@makoto-toyouke That's true. It's a troublesome process, as noise varies from one audio file to another. Has anyone found a solution?

Haarris avatar Feb 19 '25 12:02 Haarris