whisper.cpp

There seems to be no language auto-detection?

Open anubhav712 opened this issue 3 years ago • 2 comments

A language other than English has to be specified; it is not auto-detected. Is this due to missing multilingual support?

anubhav712, Oct 17 '22 17:10

This feature is not implemented yet.

I will add it sometime in the future. In the meantime, maybe someone can give it a try; the reference code is here:

https://github.com/openai/whisper/blob/main/whisper/decoding.py#L18-L69

ggerganov, Oct 18 '22 15:10

Here is a quick sketch for implementing this:

  • add a language value "auto" that can be passed to the language parameter: https://github.com/ggerganov/whisper.cpp/blob/b7c85d1ea6533fff53dd977ad6f531e19d8ff95f/whisper.h#L228
  • in whisper_full(), add the auto-detect logic for params.language == "auto", probably somewhere at the start: https://github.com/ggerganov/whisper.cpp/blob/b7c85d1ea6533fff53dd977ad6f531e19d8ff95f/whisper.cpp#L2601-L2602
  • the auto-detect logic should run the encoder on the first 30 s of audio with whisper_encode()
  • after that, it should run the decoder once with whisper_decode() and sample the highest-probability language token
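
The last step above could look roughly like the sketch below. It is not the whisper.cpp implementation. It is standalone and assumes the logits of a single decoder step are already in a vector, and that the language tokens occupy one contiguous id range. The names lang_guess and pick_language_token, and the range parameters, are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of picking a language from one decoder step.
struct lang_guess {
    int   token_id;    // id of the winning language token
    float probability; // softmax probability, computed over language tokens only
};

// Hypothetical helper, not part of whisper.h.
// `logits` holds one value per vocabulary token from a single decoder step.
// [lang_begin, lang_end) is the id range of the language tokens
// (assumed contiguous for this sketch).
lang_guess pick_language_token(const std::vector<float> & logits,
                               int lang_begin, int lang_end) {
    assert(0 <= lang_begin && lang_begin < lang_end);
    assert(static_cast<std::size_t>(lang_end) <= logits.size());

    // Argmax restricted to language tokens; every other token is ignored.
    int best = lang_begin;
    for (int i = lang_begin + 1; i < lang_end; ++i) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }

    // Softmax over the language tokens only. Subtracting the maximum logit
    // keeps exp() from overflowing.
    double sum = 0.0;
    for (int i = lang_begin; i < lang_end; ++i) {
        sum += std::exp(static_cast<double>(logits[i]) - logits[best]);
    }

    // exp(0) / sum is the probability of the winning token.
    return { best, static_cast<float>(1.0 / sum) };
}
```

Restricting both the argmax and the softmax to the language range means a high logit on an unrelated token cannot win, and the returned probability can be read as confidence among languages only.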

Maybe we can add a standalone function in whisper.h that detects the language and can be used at an arbitrary position in the audio. Something like:

const char * whisper_lang_auto_detect(struct whisper_context * ctx, int t_offset, int t_length);
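
A function like this would first have to turn t_offset and t_length into a range of samples. The thread does not say which units these parameters use, so the sketch below assumes milliseconds, 16 kHz mono input, and a cap of 30 s per encoder pass. The names detection_window and sample_window are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumptions for this sketch (not stated in the thread):
// times are in milliseconds, audio is 16 kHz mono, and at most
// 30 s of audio is passed to the encoder at once.
constexpr int k_sample_rate   = 16000;
constexpr int k_max_window_ms = 30000;

struct sample_window {
    std::size_t first; // index of the first sample to analyse
    std::size_t count; // number of samples to analyse
};

// Hypothetical helper: clamp the requested window to the available audio
// and to the 30 s limit.
sample_window detection_window(std::size_t n_samples, int t_offset_ms, int t_length_ms) {
    assert(t_offset_ms >= 0 && t_length_ms >= 0);

    const std::size_t per_ms = k_sample_rate / 1000; // 16 samples per millisecond

    // Start of the window, never past the end of the audio.
    const std::size_t first =
        std::min(n_samples, static_cast<std::size_t>(t_offset_ms) * per_ms);

    // Requested length, capped at 30 s and at the remaining audio.
    const std::size_t want =
        static_cast<std::size_t>(std::min(t_length_ms, k_max_window_ms)) * per_ms;

    return { first, std::min(want, n_samples - first) };
}
```

With one minute of audio, an offset of 45 s and a requested length of 60 s, the window starts at sample 720000 and covers the remaining 240000 samples. An offset past the end of the audio yields an empty window, which the caller would have to treat as "nothing to detect".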

ggerganov, Dec 10 '22 15:12

Thank you for implementing this! :heart:

hrehfeld, Dec 18 '22 22:12