whisper.cpp

language/translate doesn't work for mixed-language audio

Open miyagawa opened this issue 1 year ago • 6 comments

When I give whisper.cpp an audio file with mixed-language content (e.g. English and Japanese) as input, I can't get a transcript that keeps each language as it was spoken.

  • -l en (no --translate) transcribes English in English, and translates Japanese into English
  • -l en --translate transcribes English in English, and translates Japanese into English
  • -l ja (no --translate) translates English into Japanese, and transcribes Japanese in Japanese
  • -l ja --translate transcribes English in English and translates Japanese into English
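
For reference, the four flag combinations listed above could be reproduced with something like the sketch below. The binary name (`./main`), the model path, and the WAV filename are assumptions, not taken from the report; adjust them to your own checkout. The script only prints the commands, so nothing runs until you pipe its output to `sh`.

```shell
#!/bin/sh
# Assumed locations (hypothetical; change to match your setup).
WHISPER="./main"
MODEL="models/ggml-base.bin"
AUDIO="podcast-ep334.wav"

# whisper.cpp's example binary reads 16 kHz 16-bit WAV, so the sample MP3
# would need converting first, e.g.:
#   ffmpeg -i podcast-ep334.mp3 -ar 16000 -ac 1 -c:a pcm_s16le podcast-ep334.wav

# Print one command line per combination of -l {en,ja} with and without --translate.
print_runs() {
  for lang in en ja; do
    for tr in "" "--translate"; do
      # ${tr:+ $tr} appends " --translate" only when tr is non-empty.
      echo "$WHISPER -m $MODEL -f $AUDIO -l $lang${tr:+ $tr}"
    done
  done
}

print_runs
```

Running `sh script.sh | sh` (with the script saved under a name of your choice) would execute the four runs in the same order as the list.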

It's counterintuitive to me that the --translate flag makes no difference with -l en, and that even without the flag the audio gets translated anyway.

sample audio file: https://cache.rebuild.fm/podcast-ep334.mp3

Is there an option I'm missing that outputs the transcript in the languages as they were spoken, without any translation?

miyagawa, Nov 17 '22 07:11