How can I not automatically translate

Open zhaoliwen opened this issue 1 year ago • 21 comments

My voice input in Japanese is always automatically translated into English. How can I not automatically translate?

zhaoliwen avatar Nov 21 '24 06:11 zhaoliwen

The multilingual model supports both transcription and translation. For transcription, it supports Any -> Any. For translation, it supports Any -> English.

I see that your requirement is transcription, but the multilingual model provided in this git repo is built for translation. For transcription, you need to generate your own model.

You can find a Google Colab notebook for generating the model in one of the comments on this repo.

vilassn avatar Nov 21 '24 12:11 vilassn

I used the notebook below to generate whisper-base.tflite, but it always translates to English. Is there any parameter that needs to be adjusted? https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb#scrollTo=05QUPteUnXPL

zhaoliwen avatar Nov 22 '24 04:11 zhaoliwen

I am facing the same issue. Is there any update on this?

devikiran99 avatar Dec 10 '24 06:12 devikiran99

I've tried generating a tflite model as well, with the Colab above and another one I found through the comments, but sadly I can't get it to work. It would be awesome if someone could point us in the right direction so we can have a model that just transcribes in the language of the input.

Leonm99 avatar Dec 21 '24 15:12 Leonm99

I tried the pre-built app from your repo and tested with German. Sometimes it translates to English, sometimes it transcribes to German, and once it even replied to my question...

woheller69 avatar Dec 26 '24 14:12 woheller69

Using the whisper-small.tflite from here:

https://github.com/usefulsensors/openai-whisper/blob/main/models/whisper-small.tflite

works a lot better for multi-lingual transcription. Of course also a lot slower...

woheller69 avatar Dec 27 '24 10:12 woheller69

Thanks for sharing here @woheller69 !

For now, though, I'm relying on the OpenAI API and using their Whisper model there. I will give it another shot later on, since I would really like to do transcription on device to save on cost :)

Have a great day.

Leonm99 avatar Dec 27 '24 11:12 Leonm99

You can try the apk from my releases (https://github.com/woheller69/whisper_android/releases). It uses whisper-small for multilingual input and whisper-tiny for English only. I also modified the UI: press and hold the button while speaking. When the button is released, the input is transcribed.

woheller69 avatar Dec 27 '24 20:12 woheller69

It works a lot better if the audio input is normalized.
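
One way to read "normalized" here is peak normalization: scaling the recording so that its loudest sample reaches full scale before it is passed to the model. The sketch below is an illustration only and is not taken from the app; the function name `peak_normalize` and the `target_peak` parameter are made up for this example, and it assumes the samples are floats in the range -1.0 to 1.0.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale float PCM samples so the loudest one reaches target_peak.

    Hypothetical helper for illustration; assumes samples are floats
    in the range [-1.0, 1.0].
    """
    # Find the largest absolute amplitude in the recording.
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence (or empty input): nothing to scale, avoid dividing by zero.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


# A quiet recording whose loudest sample is only a quarter of full scale.
quiet = [0.0, 0.1, -0.25, 0.05]
louder = peak_normalize(quiet)
print(max(abs(s) for s in louder))
```

Peak normalization only changes the overall gain, so it leaves the signal's shape untouched; whether the app uses this or another scheme (for example RMS-based) is not stated in the thread.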

woheller69 avatar Jan 01 '25 16:01 woheller69

In my app https://github.com/woheller69/whisperIME, which is based on the Java (not native) code of this project, I can now display the detected language. Sometimes it detects the correct language, German, but translates to English instead of transcribing. If I repeat the sentence, I get it in German. Any ideas?

woheller69 avatar Jan 02 '25 07:01 woheller69

When I was trying your app I experienced the same behavior. I think maybe the model needs to be purpose built for transcription in this case?

It would be a lot easier if we could just use the original models on device :/ For now I've stuck with the OpenAI API, since the cost isn't that bad for the project I'm working on. But I'd still prefer to run transcription on device. I'll just wait it out a bit until that part improves.

Leonm99 avatar Jan 02 '25 07:01 Leonm99

It seems to be triggered by some words which may sound English. If I change one word or speak more precisely it works.

woheller69 avatar Jan 02 '25 07:01 woheller69

Unfortunately, I don't know how to create these tflite models. It would be great if someone could help and provide a Whisper Small tflite model that only does transcription.

woheller69 avatar Jan 02 '25 07:01 woheller69

V1.8 of my app uses new models which support both translation and transcription. It will be on F-Droid in a few days.

woheller69 avatar Jan 07 '25 10:01 woheller69

Which models did you use, if you don't mind sharing? Did you find them, or did you make them yourself with the Colab?

Leonm99 avatar Jan 07 '25 12:01 Leonm99

I made them. The .tflites and a Colab are here: https://huggingface.co/DocWolle/whisper_tflite_models

I essentially created 2 signatures:

  • one which sets the translation token in forced_decoder_ids
  • one which sets the transcription token in forced_decoder_ids

So as long as the model detects the correct language, it will do what it is expected to do.
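
The difference between the two signatures can be sketched as follows. This is not the code from the linked Colab; it assumes the Hugging Face convention in which `forced_decoder_ids` is a list of `[position, token_id]` pairs, and the token ids are assumptions based on the multilingual Whisper vocabulary (tiny through large-v2) that should be checked against the tokenizer in use. The helper name `build_forced_decoder_ids` is made up for this example.

```python
# Token ids below are assumptions taken from the multilingual Whisper
# vocabulary (tiny through large-v2); verify them against your tokenizer.
TRANSLATE = 50358       # <|translate|>
TRANSCRIBE = 50359      # <|transcribe|>
NO_TIMESTAMPS = 50363   # <|notimestamps|>


def build_forced_decoder_ids(task, language_token=None):
    """Return [position, token_id] pairs for the start of the decoder output.

    Hypothetical helper. Position 1 holds the language token, position 2
    the task token, position 3 the no-timestamps token. Leaving position 1
    unforced lets the model pick (detect) the language itself.
    """
    task_tokens = {"transcribe": TRANSCRIBE, "translate": TRANSLATE}
    if task not in task_tokens:
        raise ValueError("task must be 'transcribe' or 'translate'")
    forced = []
    if language_token is not None:
        forced.append([1, language_token])
    forced.append([2, task_tokens[task]])
    forced.append([3, NO_TIMESTAMPS])
    return forced


print(build_forced_decoder_ids("transcribe"))
print(build_forced_decoder_ids("translate"))
```

With the language position left unforced, the task is pinned but the language is still detected by the model, which matches the remark that the result depends on the language being detected correctly.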

woheller69 avatar Jan 07 '25 12:01 woheller69

Can anyone please provide the .tflite models that transcribe any language, and also explain how to generate those models?

devikiran99 avatar Jan 09 '25 09:01 devikiran99

The link is in my comment above. These models transcribe any language; you just cannot specify the language in advance.

woheller69 avatar Jan 09 '25 11:01 woheller69

The models you provided transcribe and translate the audio into English. For example, if the audio is in Hindi, the result is the English translation of the Hindi content. However, what I want is for the audio in Hindi to be transcribed directly into Hindi text.

devikiran99 avatar Jan 09 '25 11:01 devikiran99

No. The new model will transcribe to Hindi if you run the signature "serving_transcribe", and it will translate to English if you run "serving_translate".
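
Selecting a signature at inference time could look roughly like the sketch below. It relies on `get_signature_runner`, which the TensorFlow Lite Python `Interpreter` provides; the input name `input_features` and the output key `sequences` are assumptions about how the model was exported and may differ for a given file, and the function name `run_whisper` is made up for this example.

```python
def run_whisper(interpreter, input_features, translate=False):
    """Run one of the two signatures named in the thread.

    `interpreter` is expected to behave like tf.lite.Interpreter, i.e. to
    offer get_signature_runner(key). The tensor names "input_features" and
    "sequences" are assumptions and must match the exported model.
    """
    signature = "serving_translate" if translate else "serving_transcribe"
    runner = interpreter.get_signature_runner(signature)
    # A signature runner takes named inputs and returns a dict of outputs.
    outputs = runner(input_features=input_features)
    return outputs["sequences"]


# Usage sketch (requires TensorFlow and a model file, so it is left commented):
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="whisper-small.tflite")
# token_ids = run_whisper(interpreter, mel_spectrogram)               # transcribe
# token_ids = run_whisper(interpreter, mel_spectrogram, translate=True)
```

The returned token ids still have to be decoded into text with the Whisper vocabulary; that step is outside this sketch.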

woheller69 avatar Jan 09 '25 11:01 woheller69

If you speak Hindi, this offline translator app might also be interesting for you: https://github.com/woheller69/seemless

It is also AI based and uses the SeamlessM4T small model (which is actually quite big for a phone). It can translate between Hindi, English, Spanish, Portuguese, and Russian.

woheller69 avatar Jan 09 '25 13:01 woheller69