whisper.cpp

Feature request - transcription + translation in another language

Open swswsws583 opened this issue 2 years ago • 7 comments

I'm hoping to make some bilingual subtitles for my videos. It would be great if you could add this feature, or, hopefully, real-time transcription + translation in the future. Thanks for all the great work 😄

swswsws583 avatar Aug 28 '23 21:08 swswsws583

Hello.

For example, to go from English to Spanish I use "-l es", and it works for me.

jcalderita avatar Aug 30 '23 05:08 jcalderita

Hi, thanks for your reply. I was not trying to transcribe a non-English audio file and translate it into English, but to transcribe and translate from any language into another language, locally.

swswsws583 avatar Aug 30 '23 22:08 swswsws583

I'm pretty sure whisper.cpp [supports] other languages.

Hi, I am not sure what you meant, but I guess you can see my response to guranu.

swswsws583 avatar Aug 30 '23 22:08 swswsws583

Oh, I misread that. Well, it won't happen, because the original Whisper by OpenAI can't translate into different languages, though it can transcribe in different languages.

Since the -tr argument provides translation into English, I wonder if whisper.cpp can offer other language outputs.

swswsws583 avatar Aug 31 '23 10:08 swswsws583

OpenAI's Whisper currently only handles Any-to-English translations. If you're interested in Any-to-Any translations, you might want to check out Meta's latest Seamless-M4T.

bobqianic avatar Sep 01 '23 17:09 bobqianic

Thanks for sharing this!

swswsws583 avatar Sep 01 '23 18:09 swswsws583

I actually think this is possible with Whisper, but it's unclear how it would affect performance.

This repo shows that you can force whisper to decode to a specific language: https://github.com/Vaibhavs10/translate-with-whisper

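# Assumes `whisper_asr` is a Hugging Face transformers ASR pipeline wrapping a
# Whisper model, `common_voice_en` is an iterable dataset of English audio
# samples, and `list_of_languages` holds the target language names. None of
# these are defined in this snippet.
for lang in list_of_languages:
    # Force the decoder to emit text in `lang` while keeping the "transcribe" task.
    whisper_asr.model.config.forced_decoder_ids = (
        whisper_asr.tokenizer.get_decoder_prompt_ids(
            language=lang,
            task="transcribe",
        )
    )
    # Run the same English clip through the pipeline and print the decoded text.
    print(whisper_asr(next(iter(common_voice_en))["audio"]["array"])["text"])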

Would be fun to do a test on this with whisper.cpp

luquitared avatar Mar 14 '24 22:03 luquitared
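
One way to set up the test suggested above with whisper.cpp might be to run the same audio file once per target language, forcing the decoder language with "-l" instead of using "-tr". The sketch below only builds the command lines. The options -m, -f, -l, -osrt and -of are whisper.cpp CLI options, but the binary path, model file, sample file, output layout, and the helper name build_whisper_cpp_commands are assumptions for illustration, and output quality for this kind of forced decoding is untested here.

```python
from pathlib import Path


def build_whisper_cpp_commands(binary, model, audio, languages, out_dir="out"):
    """Return one whisper.cpp argv list per forced output language.

    Hypothetical helper: paths and output naming are illustrative assumptions.
    """
    commands = []
    for lang in languages:
        # Output path without extension, e.g. out/clip.es (whisper.cpp adds .srt).
        out_base = Path(out_dir) / f"{Path(audio).stem}.{lang}"
        commands.append([
            str(binary),
            "-m", str(model),            # model file
            "-f", str(audio),            # input audio file
            "-l", lang,                  # force the decoder language
            "-osrt",                     # write subtitles as .srt
            "-of", out_base.as_posix(),  # output file path without extension
        ])
    return commands


if __name__ == "__main__":
    cmds = build_whisper_cpp_commands(
        binary="./main",                 # assumed path; the binary name varies by build
        model="models/ggml-medium.bin",  # assumed model file
        audio="samples/jfk.wav",         # assumed English sample
        languages=["es", "fr", "de"],
    )
    for cmd in cmds:
        print(" ".join(cmd))
        # To actually run a command: subprocess.run(cmd, check=True)
```

Each printed line can be pasted into a shell, and the resulting .srt files can then be compared against the English transcript to judge whether the forced-language output is usable.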