
distil + word_timestamps=True => CRASH

Open ExtReMLapin opened this issue 1 year ago • 4 comments

Hello. When using this finetuned version of distil whisper with word_timestamps=True, it crashes as soon as the transcription starts. There is no issue with word_timestamps=False.

It's a CRASH, not a Python error: it exits the Python process outright, with no traceback and no crash message, nothing, just byebye amigo hasta la vista.

ExtReMLapin avatar Feb 15 '24 12:02 ExtReMLapin

Is that model working with vanilla Whisper and word_timestamps=True?

Purfview avatar Feb 15 '24 15:02 Purfview

Yes.

I gave it a try with:

import whisper
# Load model
model = whisper.load_model("./models/whisper-large-v3-french-distil-dec16/original_model.pt")

# Transcribe
result = model.transcribe("./tmp0p6z2kmk_short.wav", language="fr", word_timestamps=True)
print(result)

ExtReMLapin avatar Feb 15 '24 16:02 ExtReMLapin

@ExtReMLapin, hello. I encountered the same error as you when running with word_timestamps=True. You can see this comment. My error came from the alignment_heads field in the model's config.json file. Can you re-check this file in your finetuned model? For the exact error, you can check this comment. Besides, you should add condition_on_previous_text=False to improve the transcription quality. Hope this helps.
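
For anyone wanting to re-check that file quickly, here is a minimal sketch. It assumes the converted model directory contains a config.json whose alignment_heads field is a list of [layer, head] pairs, and it assumes the problem is an entry pointing at a decoder layer or head the distilled model does not have. The helper names and the layer/head counts are hypothetical, so substitute the values for your own model.

```python
import json
from pathlib import Path


def load_alignment_heads(model_dir):
    """Return the alignment_heads entry from <model_dir>/config.json, or None if absent."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return config.get("alignment_heads")


def out_of_range_heads(heads, num_layers, num_heads):
    """List the (layer, head) pairs that fall outside the decoder's dimensions.

    num_layers and num_heads are supplied by the caller; this sketch does not
    try to read them from the model.
    """
    return [
        (layer, head)
        for layer, head in heads
        if not (0 <= layer < num_layers and 0 <= head < num_heads)
    ]
```

For example, out_of_range_heads(load_alignment_heads("./models/my-model"), num_layers=16, num_heads=20) returns an empty list when every pair fits; the path and both counts here are placeholders.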

trungkienbkhn avatar Feb 21 '24 09:02 trungkienbkhn

Hi @ExtReMLapin,

Just fixed the issue here, thanks to @Jeronymous!

By the way, for this version, it's true that condition_on_previous_text=False will yield better performance for long-form sequential decoding, as pointed out by @trungkienbkhn.
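
As a rough sketch of how the two options combine in faster-whisper: the helper below is hypothetical and takes any object with a faster-whisper style transcribe method, which returns an iterable of segments plus an info object. It only illustrates the arguments discussed in this thread.

```python
def transcribe_with_words(model, audio_path, language="fr"):
    """Transcribe with word timestamps and without conditioning on previous text.

    `model` is expected to behave like faster_whisper.WhisperModel.
    Returns a list of (start, end, word) tuples.
    """
    segments, _info = model.transcribe(
        audio_path,
        language=language,
        word_timestamps=True,
        condition_on_previous_text=False,
    )
    words = []
    for segment in segments:  # segments are produced lazily while iterating
        for word in segment.words or []:
            words.append((word.start, word.end, word.word))
    return words
```

With the real library this would be called as transcribe_with_words(WhisperModel("path/to/converted-model"), "audio.wav"), where both paths are placeholders.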

bofenghuang avatar Mar 03 '24 20:03 bofenghuang