faster-whisper
beam_size not working when using a converted model
I want to use cahya/whisper-large-id, so I converted it with:
ct2-transformers-converter --model "cahya/whisper-large-id" \
--output_dir "cahya-whisper-large-id-ct2" --quantization float16
But changing beam_size has no effect: it always returns 30-second segments, and I want them under 5 seconds.
from faster_whisper import WhisperModel

model = WhisperModel("cahya-whisper-large-id-ct2", device="cuda", compute_type="float16")
segments, _ = model.transcribe("voice.wav", beam_size=1, language="id")
for segment in segments:
    print("[%.2f -> %.2f] %s" % (segment.start, segment.end, segment.text))
beam_size is not related to segment duration; it's the number of beams used in beam search decoding.
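If the actual goal is shorter segments, a minimal sketch (assuming a faster-whisper version that exposes the built-in Silero VAD options) is to enable the VAD filter and cap the speech chunk length, so the audio is split on silence instead of Whisper's fixed 30-second window:

from faster_whisper import WhisperModel

model = WhisperModel("cahya-whisper-large-id-ct2", device="cuda", compute_type="float16")

# vad_filter splits the audio on detected silence; max_speech_duration_s
# caps each speech chunk, so segments come back shorter.
segments, _ = model.transcribe(
    "voice.wav",
    language="id",
    vad_filter=True,
    vad_parameters=dict(max_speech_duration_s=5),
)
for segment in segments:
    print("[%.2f -> %.2f] %s" % (segment.start, segment.end, segment.text))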
I don't know why, but it works with the large-v3 model:
(screenshot: default)
(screenshot: with beam_size=1)
On some other audio you can observe the opposite effect.
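Another way to guarantee sub-5-second output, as a sketch assuming your faster-whisper version supports word_timestamps=True, is to ignore the model's segmentation entirely and regroup the word-level timestamps yourself:

from faster_whisper import WhisperModel

model = WhisperModel("cahya-whisper-large-id-ct2", device="cuda", compute_type="float16")
segments, _ = model.transcribe("voice.wav", language="id", word_timestamps=True)

# Flatten the word stream and start a new chunk whenever adding the next
# word would push the current chunk past 5 seconds.
chunk, chunk_start = [], None
for segment in segments:
    for word in segment.words:
        if chunk_start is None:
            chunk_start = word.start
        if word.end - chunk_start > 5.0 and chunk:
            text = "".join(w.word for w in chunk).strip()
            print("[%.2f -> %.2f] %s" % (chunk_start, chunk[-1].end, text))
            chunk, chunk_start = [], word.start
        chunk.append(word)
if chunk:
    text = "".join(w.word for w in chunk).strip()
    print("[%.2f -> %.2f] %s" % (chunk_start, chunk[-1].end, text))

This doesn't depend on how the model segments the audio, so it behaves the same for converted and stock models.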