whisper_android How can I not automatically translate?

My voice input in Japanese is always automatically translated into English. How can I prevent the automatic translation?

Nov 21 '24 06:11 zhaoliwen

The multilingual model supports transcription and translation. For transcription, it supports Any -> Any. For translation, it supports Any -> English.

I see that you need transcription. However, the multilingual model provided in this Git repo is built for translation. For transcription, you need to generate your own model.

You can find a Google Colab notebook for generating the model in one of the comments on this repo.

Nov 21 '24 12:11 vilassn

I used the notebook below to generate whisper-base.tflite, but it always translates to English. Is there any parameter that needs to be adjusted? https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb#scrollTo=05QUPteUnXPL

Nov 22 '24 04:11 zhaoliwen

I am also facing the same issue. Is there any update on this?

Dec 10 '24 06:12 devikiran99

I've tried generating a tflite model as well, with the Colab above and another one I found through the comments, but sadly I can't get it to work. It would be awesome if someone could point us in the right direction so we can have a model that just transcribes in the language of the input.

Dec 21 '24 15:12 Leonm99

I tried the pre-built app from your repo and tested with German. Sometimes it translates to English, sometimes it transcribes to German, and once it even replied to my question...

Dec 26 '24 14:12 woheller69

Using the whisper-small.tflite from here:

https://github.com/usefulsensors/openai-whisper/blob/main/models/whisper-small.tflite

works a lot better for multi-lingual transcription. Of course also a lot slower...

Dec 27 '24 10:12 woheller69

Thanks for sharing here @woheller69 !

For now I'm relying on the OpenAI API and using their Whisper model there. I will give it another shot later on, since I would really like to do transcription on device to save on cost :)

Have a great day.

Dec 27 '24 11:12 Leonm99

You can try the apk in my releases: https://github.com/woheller69/whisper_android/releases It uses whisper-small for multi-lingual and whisper-tiny for English only. I also modified the UI. Press and hold the button while speaking. When the button is released the input is transcribed.

Dec 27 '24 20:12 woheller69

It works a lot better if the audio input is normalized.
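
As a rough illustration of what normalizing the input could look like, here is a minimal Python sketch of peak normalization. It assumes the audio is already available as float PCM samples in the range [-1.0, 1.0]. The function name `peak_normalize` and the `target_peak` value are hypothetical, and the thread does not say which normalization method was actually used.

```python
def peak_normalize(samples, target_peak=0.95):
    """Scale float PCM samples so the loudest sample reaches target_peak.

    Assumption: samples are floats in [-1.0, 1.0]. Peak normalization is
    only one possible reading of "normalized" in the comment above.
    """
    # Largest absolute amplitude in the recording (0.0 for empty input).
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence or empty input: there is nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_recording = [0.01, -0.02, 0.005]
louder = peak_normalize(quiet_recording)
```

A quiet recording is scaled up so that its loudest sample sits at `target_peak`, while the relative shape of the waveform is unchanged.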

Jan 01 '25 16:01 woheller69

In my app https://github.com/woheller69/whisperIME, based on the Java (not native) code of this project, I can now display the detected language. Sometimes it detects the correct language (German) but translates to English instead of transcribing. If I repeat the sentence, I get it in German. Any ideas?

Jan 02 '25 07:01 woheller69

When I was trying your app I experienced the same behavior. I think maybe the model needs to be purpose built for transcription in this case?

It would be a lot easier if we could just use the original models on device :/ For now I've stuck with the OpenAI API, since the cost isn't that bad for the project I'm working on. But I'd still prefer to run transcription on device. I'll just wait it out a bit until that part improves.

Jan 02 '25 07:01 Leonm99

It seems to be triggered by some words which may sound English. If I change one word or speak more precisely it works.

Jan 02 '25 07:01 woheller69

Unfortunately I don't know how to create these tflite models. It would be great if someone could help and provide a Whisper Small tflite with transcription function only.

Jan 02 '25 07:01 woheller69

V1.8 of my app uses new models which support translation and transcription. It will be on F-Droid in a few days.

Jan 07 '25 10:01 woheller69

Which models did you use, if you don't mind sharing? Did you find them, or did you make them with the Colab?

Jan 07 '25 12:01 Leonm99

I made them. The .tflites and a Colab are here: https://huggingface.co/DocWolle/whisper_tflite_models

I essentially created 2 signatures:

one which sets the translation token in forced_decoder_ids
one which sets the transcription token in forced_decoder_ids

So as long as the model detects the correct language it will do what it is expected to do
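
To make the difference between the two signatures concrete, here is a minimal Python sketch of how the two `forced_decoder_ids` lists might differ. It is not taken from the linked Colab. It assumes `forced_decoder_ids` is a list of (position, token_id) pairs, as in Hugging Face transformers, and that the token IDs below are the ones used by the multilingual tiny/base/small vocabularies; please verify them against the tokenizer before relying on them. The helper name `build_forced_decoder_ids` is hypothetical, and leaving the language slot as `None` so the model detects the language itself is also an assumption.

```python
# Assumed special-token IDs for multilingual Whisper (tiny/base/small).
# Check them against the tokenizer of the model you actually export.
TRANSLATE_TOKEN = 50358      # <|translate|>
TRANSCRIBE_TOKEN = 50359     # <|transcribe|>
NO_TIMESTAMPS_TOKEN = 50363  # <|notimestamps|>


def build_forced_decoder_ids(task, language_token=None):
    """Return (position, token_id) pairs that force the task token.

    Position 1 is the language slot; None is meant to leave the choice
    of language to the model (assumption). Position 2 carries the task.
    """
    if task == "transcribe":
        task_token = TRANSCRIBE_TOKEN
    elif task == "translate":
        task_token = TRANSLATE_TOKEN
    else:
        raise ValueError("task must be 'transcribe' or 'translate'")
    return [(1, language_token), (2, task_token), (3, NO_TIMESTAMPS_TOKEN)]


transcribe_ids = build_forced_decoder_ids("transcribe")
translate_ids = build_forced_decoder_ids("translate")
```

The two lists differ only in the token forced at position 2, which is why a single exported model could expose both behaviors through separate signatures.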

Jan 07 '25 12:01 woheller69

Can anyone please provide the .tflite models which transcribe any language? Could you also explain how to generate those models?

Jan 09 '25 09:01 devikiran99

In my comment above you have the link. These models transcribe any language, you just cannot specify the language in advance

Jan 09 '25 11:01 woheller69

The models you provided transcribe and translate the audio into English. For example, if the audio is in Hindi, the result is the English translation of the Hindi content. However, what I want is for the audio in Hindi to be transcribed directly into Hindi text.

Jan 09 '25 11:01 devikiran99

No. The new model will transcribe to Hindi if you run the signature "serving_transcribe", and it will translate to English if you run "serving_translate".
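
For anyone testing the models outside the Android app, here is a minimal Python sketch of selecting a signature by name. It relies on `get_signature_runner`, which the TensorFlow Lite Python `Interpreter` provides. The helper name `run_whisper_task` is hypothetical, and the input keyword `input_features` is an assumption about how the model was exported; the actual input and output names should be checked with `interpreter.get_signature_list()`.

```python
def run_whisper_task(interpreter, input_features, task="transcribe"):
    """Run the "serving_transcribe" or "serving_translate" signature.

    interpreter: an object exposing get_signature_runner(name), such as
        tf.lite.Interpreter loaded with one of the .tflite models.
    input_features: the log-mel spectrogram expected by the model.
    The keyword name "input_features" is an assumption, not confirmed here.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    runner = interpreter.get_signature_runner("serving_" + task)
    # The runner returns a dict that maps output names to arrays.
    return runner(input_features=input_features)


# Usage (needs TensorFlow and a model file; the file name is a placeholder):
#   import tensorflow as tf
#   interpreter = tf.lite.Interpreter(model_path="whisper-small.tflite")
#   outputs = run_whisper_task(interpreter, mel_features, task="transcribe")
```

Passing the interpreter in as a parameter keeps the helper independent of how the model file is loaded.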

Jan 09 '25 11:01 woheller69

If you speak Hindi this offline translator app might also be interesting for you. https://github.com/woheller69/seemless

It is also AI-based and uses the SeamlessM4T small model (which is actually quite big for a phone). It can translate between Hindi, English, Spanish, Portuguese, and Russian.

Jan 09 '25 13:01 woheller69