When testing Chinese, there are no punctuation marks in the results!

Open Yaodada12 opened this issue 1 year ago • 4 comments

I have tried both faster-whisper-v2 and faster-whisper-v3.

from faster_whisper import WhisperModel

model = WhisperModel("large-v3")

segments, info = model.transcribe("zh_audio.mp3")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Yaodada12 avatar Jan 31 '24 09:01 Yaodada12

@Yaodada12, hello. In my tests, large-v3 gave poor quality and no punctuation, while large-v2 gave quite good quality. I then added the option condition_on_previous_text=False with the large-v3 model and found that the quality improved a lot. Can you try again with this option? My code:

from faster_whisper import WhisperModel
model = WhisperModel('large-v3', device='cuda')
segments, info = model.transcribe('zh.m4a', word_timestamps=True, condition_on_previous_text=False)

trungkienbkhn avatar Feb 01 '24 02:02 trungkienbkhn
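
To check whether the option actually changes the punctuation in your output, a rough comparison like the one below may help. This is only a sketch: the helper names (punctuation_ratio, transcribe_text, compare), the punctuation set, the language="zh" setting and the 'zh.m4a' file name are assumptions for illustration and are not taken from this thread.

```python
# Sketch only: compare punctuation in the output with and without
# condition_on_previous_text. Helper names and the punctuation set are
# assumptions made for this example.

# Full-width and ASCII punctuation commonly seen in Chinese text.
PUNCTUATION = set("，。！？；：、,.!?;:")


def punctuation_ratio(text):
    """Return the fraction of non-space characters that are punctuation."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c in PUNCTUATION for c in chars) / len(chars)


def transcribe_text(model, audio_path, condition_on_previous_text):
    """Transcribe one file and return the concatenated segment text."""
    segments, info = model.transcribe(
        audio_path,
        language="zh",  # assumption: force Chinese instead of auto-detecting
        condition_on_previous_text=condition_on_previous_text,
    )
    # segments is a generator; the transcription runs while it is consumed.
    return "".join(segment.text for segment in segments)


def compare(audio_path, model_size="large-v3", device="cuda"):
    """Print the punctuation ratio for both settings of the option."""
    # Imported here so punctuation_ratio can be used without the library.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device)
    for flag in (True, False):
        text = transcribe_text(model, audio_path, flag)
        print("condition_on_previous_text=%s ratio=%.3f"
              % (flag, punctuation_ratio(text)))
```

Calling compare('zh.m4a') loads the model once and prints one line per setting. A ratio of 0.000 means no punctuation was produced at all. The ratio is only a crude indicator and says nothing about whether the punctuation is placed correctly.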

Thanks, I will try.

Yaodada12 avatar Feb 01 '24 08:02 Yaodada12

Same issue here; I use large-v2 for ZH.

hscspring avatar Apr 10 '24 09:04 hscspring

@hscspring I cannot even transcribe 'zh'.

mru4913 avatar Apr 25 '24 09:04 mru4913