faster-whisper Faster Whisper large-v2 model is repeating the segments

Hi, I am currently using the faster-whisper large-v2 model with German audio, and it repeats the same text in a loop. I cannot find the cause in faster-whisper, but openai/whisper does not produce these repeated segments for the same file. Here is my transcription code using faster-whisper large-v2:

```python
from faster_whisper import WhisperModel
import time

model_whisper = WhisperModel("large-v2", device="cuda", compute_type="float32", device_index=[0])

segments, info = model_whisper.transcribe(
    "../../test-audio/dog.MP3", beam_size=5, language="de", vad_filter=True
)
for segment in segments:
    print(segment.text)
```

The output contains the same segments repeated:

Guten Tag und herzlich Willkcmen bei Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Herzlich Willkcmen bei der Deutschen Ich rufe aus "private data". Ich rufe aus "private data". der Deutschen t&dien. Ich interessiere mich für den Glasfaserausbau. Ich interessiere mich für den Glasfaserausbau. Ich würde gerne einen Beratungstermin vereinbaren. Ich würde gerne einen Beratungstermin vereinbaren. Gibt mir eine ganz kurze Postleitzahl. Gibt mir eine ganz kurze Postleitzahl. Beratung können Sie nur online beantragen. Beratung können Sie nur online beantragen.

Aug 14 '23 10:08 Talhazeb

Could you share the audio sample to reproduce the issue?

Aug 14 '23 17:08 Purfview

@Purfview Sure, I can send it to you by email. Could you kindly share your email address here?

Aug 14 '23 17:08 Talhazeb

purfview [@] protonmail [.] com

Aug 14 '23 17:08 Purfview

sent

Aug 14 '23 18:08 Talhazeb

I didn't get any repeats. The settings I used: --device=cpu --language=de --model=large-v2 --compute_type=float32 --beam_size=5 --vad_filter=False

Make sure you are using the latest 0.7.1 version.

Aug 14 '23 19:08 Purfview

@Talhazeb Did you try with the latest version?

Aug 28 '23 09:08 guillaumekln

@guillaumekln Yes, with the latest version (0.7.1).

Aug 28 '23 09:08 Talhazeb

Do you get repeats with the settings I used? The only differences from yours were device=cpu and vad_filter=False.

Aug 28 '23 14:08 Purfview

+1 here. The large-v1 model works; large-v2 and medium do not.

Aug 29 '23 05:08 zyokia

Please share the input audio file if possible.

Aug 29 '23 08:08 guillaumekln

@Purfview I need the vad_filter, since disabling it causes problems with other audio files. @guillaumekln Could you kindly share your email address so I can send you the file? Thanks.

Aug 29 '23 08:08 Talhazeb

guillaume [.] klein [@] systrangroup [.] com

Aug 29 '23 09:08 guillaumekln

@guillaumekln sent

Aug 29 '23 09:08 Talhazeb

The VAD filter is creating the problem here. You can try making the filter more conservative, for example by increasing the minimum silence duration from 2 seconds to 3 seconds:

model.transcribe(..., vad_filter=True, vad_parameters=dict(min_silence_duration_ms=3000))

Note that openai/whisper does not apply a separate VAD filter.
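
For context, here is a minimal sketch of how that setting might fit into the script from the original post, printing each segment with its time range so repeated text is easier to locate. The helper names `format_segment` and `transcribe_with_vad` are hypothetical, and the model, device, and language values are simply copied from the original post; treat it as an illustration, not a verified fix.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment with its time range so repeats are easy to spot."""
    return f"[{start:7.2f}s -> {end:7.2f}s] {text.strip()}"


def transcribe_with_vad(path: str, min_silence_ms: int = 3000) -> list:
    """Transcribe `path` with the VAD filter and a custom minimum silence duration."""
    # Imported lazily so format_segment can be used without the library or a GPU.
    from faster_whisper import WhisperModel

    # Settings mirror the original post; adjust device/compute_type as needed.
    model = WhisperModel("large-v2", device="cuda", compute_type="float32")
    segments, _info = model.transcribe(
        path,
        beam_size=5,
        language="de",
        vad_filter=True,
        vad_parameters=dict(min_silence_duration_ms=min_silence_ms),
    )
    # `segments` is a generator; transcription runs as it is consumed.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe_with_vad("../../test-audio/dog.MP3")` and printing the returned lines should show whether the repeated text falls on adjacent time ranges.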

Aug 29 '23 10:08 guillaumekln

@guillaumekln Thanks a lot for checking it out and letting me know. I will check and let you know.

Aug 29 '23 16:08 Talhazeb

I think I have run into the same issue. Adjusting the min_silence_duration_ms parameter just makes the repeated segments appear in other places. It would be very time-consuming to test each audio file repeatedly to find a value that prevents the repeats. I would like to automate this part as well.
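
One possible way to automate that check, as a rough sketch only: count segments whose text matches the segment right before it, then try a few min_silence_duration_ms values until a transcript comes back without such repeats. The function names and the candidate values are assumptions made for illustration, and this only catches exact consecutive repeats after simple normalization, so it may miss partial or non-adjacent ones.

```python
import re
from typing import Callable, Iterable, List, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical repeats compare equal."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()


def count_consecutive_repeats(texts: Iterable[str]) -> int:
    """Count segments whose normalized text equals the segment right before it."""
    repeats = 0
    previous = None
    for text in texts:
        current = normalize(text)
        if current and current == previous:
            repeats += 1
        previous = current
    return repeats


def pick_min_silence(
    transcribe: Callable[[int], List[str]],
    candidates_ms: Iterable[int] = (2000, 3000, 4000, 5000),
) -> Optional[int]:
    """Return the first candidate whose transcript has no consecutive repeats, else None."""
    for ms in candidates_ms:
        if count_consecutive_repeats(transcribe(ms)) == 0:
            return ms
    return None


# Hypothetical wiring with faster-whisper (`model` and the path are placeholders):
# def transcribe(ms):
#     segments, _ = model.transcribe(
#         "audio.mp3", language="de", vad_filter=True,
#         vad_parameters=dict(min_silence_duration_ms=ms),
#     )
#     return [s.text for s in segments]
#
# best = pick_min_silence(transcribe)
```

Each candidate costs a full transcription pass, so the candidate list is best kept short.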

Aug 30 '23 10:08 makoto-toyouke

@makoto-toyouke That's true. It's a troublesome process, as the noise varies from one audio file to another. Has anyone found a solution?

Feb 19 '25 12:02 Haarris