whisper.cpp
Translate to any language with forced decoding?
A cool feature worth exploring would be letting users translate to any target language, rather than just English.
Whisper's translation task was trained only to go from the input language into English, but this repo shows that you can force Whisper to decode into a specific language: https://github.com/Vaibhavs10/translate-with-whisper
for lang in list_of_languages:
    whisper_asr.model.config.forced_decoder_ids = (
        whisper_asr.tokenizer.get_decoder_prompt_ids(
            language=lang,
            task="transcribe"
        )
    )
    print(whisper_asr(next(iter(common_voice_en))["audio"]["array"])["text"])
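To make the trick above a bit more concrete, here is a minimal sketch of the decoder prompt that the forced ids correspond to. It loads no model; it only builds the special-token strings in the form Whisper uses (`<|startoftranscript|>`, a language token, a task token, and optionally `<|notimestamps|>`). The helper name `build_forced_prompt` is hypothetical and made up for illustration. The point is that the language token is set to the desired output language while the task stays "transcribe", even though the audio is in another language.

```python
from typing import List

# Hypothetical helper for illustration only; it is not part of whisper.cpp
# or transformers. It only assembles Whisper-style special-token strings.
def build_forced_prompt(language_code: str,
                        task: str = "transcribe",
                        timestamps: bool = False) -> List[str]:
    """Return the decoder prompt tokens that would be forced before decoding."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unknown task: {task!r}")
    prompt = [
        "<|startoftranscript|>",
        f"<|{language_code}|>",  # target language, not the spoken language
        f"<|{task}|>",           # "transcribe" is kept on purpose
    ]
    if not timestamps:
        prompt.append("<|notimestamps|>")
    return prompt


if __name__ == "__main__":
    # English audio, but the decoder is told the language is German, French, ...
    for code in ("de", "fr", "ja"):
        print(code, build_forced_prompt(code))
```

In whisper.cpp the closest experiment would presumably be passing the target language through the existing language option (`-l` in the CLI) while leaving translation off. Whether that yields usable translations is the open question here.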
It would be fun to test this with whisper.cpp.
Originally posted by @luquitared in https://github.com/ggerganov/whisper.cpp/issues/1219#issuecomment-1998577753