Fail to transcribe in Chinese

Open • mru4913 opened this issue 1 year ago • 1 comment

I tried the following code, based on README.md.

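import time

from faster_whisper import WhisperModel

# Local directory holding the converted Distil-Whisper large-v2 model.
model_size = "./faster-distil-whisper-large-v2"

# Load the model on the GPU with half-precision weights.
model = WhisperModel(model_size, device="cuda", compute_type="float16")

t1 = time.perf_counter()
segments, info = model.transcribe(
    "............./../0.mp3",
    # beam_size=5,
    language="zh",  # force Chinese instead of auto-detecting
    condition_on_previous_text=False,
)
# `segments` is a lazy generator, so this measures setup only;
# the actual decoding happens while iterating in the loop below.
print(time.perf_counter() - t1)
print(
    "Detected language '%s' with probability %f"
    % (info.language, info.language_probability)
)
for segment in segments:
    print(segment.text)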

The output is:

0.06374595290981233
Detected language 'zh' with probability 1.000000
 to me, so I want to say that I want to say,
 if you're if you're to my their research to try to...

The audio is in Chinese (Mandarin), and I can't figure out why the output is in English. Any help would be appreciated.

Apr 25 '24 09:04 mru4913

Distil models are English-only; you need to use a multilingual model.
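As a rough sketch of that advice, something like the following could guard against loading an English-only model for non-English audio. The helper names `choose_model` and `transcribe` are hypothetical, and the name check ("distil" in the model name, or a ".en" suffix) is only an assumed heuristic, not something faster-whisper does itself. "large-v2" is used as the multilingual fallback; any other multilingual Whisper size should work the same way.

```python
def choose_model(requested: str, language: str, fallback: str = "large-v2") -> str:
    """Return `requested` unless it looks English-only and the audio is not English.

    Assumption: English-only models are recognised purely by name
    ("distil" anywhere in the name, or a ".en" suffix such as "small.en").
    """
    name = requested.lower()
    english_only = "distil" in name or name.endswith(".en")
    if english_only and language != "en":
        return fallback
    return requested


def transcribe(audio_path: str, requested: str, language: str) -> list[str]:
    """Hypothetical wrapper: transcribe with a model suited to `language`."""
    # Imported lazily so choose_model works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        choose_model(requested, language),
        device="cuda",
        compute_type="float16",
    )
    segments, _info = model.transcribe(audio_path, language=language)
    # Iterating the generator is what actually runs the transcription.
    return [segment.text for segment in segments]
```

With the setup from the question, `transcribe(path, "./faster-distil-whisper-large-v2", "zh")` would load "large-v2" instead of the Distil model, while English audio would still use the Distil model.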

Apr 26 '24 10:04 Purfview