Other languages and Whisper models

Open • fuglu opened this issue 1 year ago • 1 comment

Hi and thanks for sharing this awesome project! 🤩

Currently it seems that only English is supported/configured, but we would also like to try other languages (e.g. German).

So we started with Whisper. We briefly tried using the Whisper small model instead of small.en by simply patching build-whisper.sh and rebuilding the Docker container. That doesn't seem to be the only place we need to change, though, because this is all we get when running the container:

INFO:root:[Whisper INFO:] New client connected

INFO:root:[Whisper INFO]: . br,pt whe int Mus............................................, eos: True
INFO:root:[Whisper INFO]: Average inference time 0.37747994336214935


INFO:root:[Whisper INFO]: .. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br., eos: True
INFO:root:[Whisper INFO]: Average inference time 0.31598156690597534

Before we dig deeper into the project (we just found it today), we thought we'd quickly ask if you might have any tips/recommendations for us or are already working on similar ideas.

Thanks again!

fuglu, Jan 31 '24 15:01

Hello, thanks for the interest in the project. For the transcription part make sure to also pass the right language here:

https://github.com/collabora/WhisperFusion/blob/1de4c740954848883f911e6c97e1db105b999b82/examples/chatbot/html/js/main.js#L146

Set it to "de" for German.
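As a rough illustration of what passing the language from the browser client could look like, here is a minimal sketch. The helper name buildTranscriptionConfig and the field names (uid, multilingual, language, task) are assumptions made for this example, not copied from main.js, so check them against the line linked above before relying on them.

```javascript
// Hypothetical helper: the field names below are assumptions for illustration
// and may not match what main.js actually sends to the server.
function buildTranscriptionConfig(language) {
  const isEnglish = language === "en";
  return {
    uid: "example-client",      // placeholder client id (assumption)
    multilingual: !isEnglish,   // a non-English language needs a multilingual model, i.e. small rather than small.en
    language: language,         // e.g. "de" for German
    task: "transcribe",
  };
}

// Serialize the options the way a WebSocket client would typically send them.
const message = JSON.stringify(buildTranscriptionConfig("de"));
console.log(message);
```

The point of the sketch is only that the language code has to change on the client side as well as in the model build; swapping small.en for small alone does not tell the server which language to decode.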

Also, make sure to use Mistral, since phi-2 has limited support for German. Note that WhisperSpeech currently supports only Polish and English; we are working on a German version, so until then the output might sound a little strange.

zoq, Jan 31 '24 15:01