whisper.cpp

Translate to any language with forced decoding?

Open luquitared opened this issue 1 year ago • 0 comments

A cool feature worth exploring would be letting users translate to any target language, rather than just English.

Whisper's translation task was trained only for input language --> English, but this repo shows that you can force Whisper to decode into a specific language: https://github.com/Vaibhavs10/translate-with-whisper

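# Assumes `whisper_asr` is a Hugging Face `transformers` ASR pipeline wrapping a
# multilingual Whisper checkpoint, `common_voice_en` is an iterable of English audio
# samples, and `list_of_languages` holds the target languages to try.
sample = next(iter(common_voice_en))["audio"]["array"]

for lang in list_of_languages:
    # Force the decoder prompt to the target language while keeping the "transcribe" task.
    whisper_asr.model.config.forced_decoder_ids = (
        whisper_asr.tokenizer.get_decoder_prompt_ids(language=lang, task="transcribe")
    )
    print(whisper_asr(sample)["text"])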
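
For context, the forced decoder IDs are, as far as I understand it, just the special tokens that follow `<|startoftranscript|>` in Whisper's decoder prompt: a language token, a task token, and optionally `<|notimestamps|>`. Here is a minimal sketch of that layout, using token strings rather than real token IDs. The helper name `decoder_prompt` is hypothetical and not part of any library.

```python
def decoder_prompt(language, task="transcribe", timestamps=False):
    """Return Whisper's decoder prompt as token strings (illustrative only).

    `language` is a Whisper language code such as "fr" or "de".
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unknown task: {task!r}")
    # Start token, then the language token, then the task token.
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Ask the model for plain text without timestamp tokens.
        tokens.append("<|notimestamps|>")
    return tokens


print(decoder_prompt("fr"))
```

The trick in the linked repo amounts to swapping the language token for one that does not match the audio while leaving the task as `transcribe`.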

It would be fun to test this with whisper.cpp.
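
A rough way to try it might be to run the whisper.cpp CLI once per target language with `-l` set to that language and without `--translate`, then compare the outputs. The sketch below only builds the command lines and does not run them. The binary path, model path, and audio file are placeholder assumptions, and whether `-l` alone reproduces the forced-decoding behaviour is exactly what would need testing.

```python
import shlex


def whisper_cpp_commands(languages, binary="./main",
                         model="models/ggml-base.bin",
                         audio="samples/jfk.wav"):
    """Build one whisper.cpp CLI invocation per target language.

    All paths are placeholders; a multilingual model is assumed, since an
    English-only (.en) model cannot emit other languages.
    """
    # -m: model file, -f: input audio, -l: spoken/decoding language
    return [[binary, "-m", model, "-f", audio, "-l", lang] for lang in languages]


for cmd in whisper_cpp_commands(["fr", "de", "es"]):
    # Print shell-ready commands to copy, paste, and compare by hand.
    print(shlex.join(cmd))
```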

Originally posted by @luquitared in https://github.com/ggerganov/whisper.cpp/issues/1219#issuecomment-1998577753

luquitared Mar 14 '24 22:03