How can I disable automatic translation?
My Japanese voice input is always automatically translated into English. How can I turn off this automatic translation?
The multilingual model supports both transcription and translation. For transcription it supports Any -> Any; for translation it supports Any -> English.
I see your requirement is transcription, but the multilingual model in this git repo was generated for translation. For transcription, you need to generate the model yourself.
You can find a Google Colab notebook for generating the model in one of the comments on this repo.
I used the Colab notebook below to generate whisper-base.tflite, but it always translates to English. Is there any parameter that needs to be adjusted? https://colab.research.google.com/github/usefulsensors/openai-whisper/blob/main/notebooks/generate_tflite_from_whisper.ipynb#scrollTo=05QUPteUnXPL
I am also facing the same issue. Is there any update regarding this?
I've tried generating a tflite model with the above Colab and another one I found through the comments, but sadly I can't get it to work. It would be awesome if someone could point us in the right direction so we can have a model that just transcribes in the language of the input.
I tried the pre-built app from your repo and tested with German. Sometimes it translates to English, sometimes it transcribes to German, and once it even replied to my question...
Using the whisper-small.tflite from here:
https://github.com/usefulsensors/openai-whisper/blob/main/models/whisper-small.tflite
works a lot better for multi-lingual transcription. Of course, it is also a lot slower...
Thanks for sharing here @woheller69 !
For now I'm relying on the OpenAI API and using their Whisper model there. I will give it another shot later, since I would really like to do transcription on-device to save on cost :)
Have a great day.
You can try the apk in my releases: https://github.com/woheller69/whisper_android/releases It uses whisper-small for multi-lingual and whisper-tiny for English only. I also modified the UI. Press and hold the button while speaking. When the button is released the input is transcribed.
It works a lot better if the audio input is normalized.
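For context, normalizing here just means rescaling the samples so the loudest point sits near full scale before the spectrogram is computed. A minimal sketch of peak normalization (the 0.95 target is an illustrative choice, not a value from this thread):

```python
import numpy as np

def normalize_audio(samples: np.ndarray, target_peak: float = 0.95) -> np.ndarray:
    """Peak-normalize float32 audio in [-1, 1] before feeding it to Whisper."""
    peak = np.max(np.abs(samples))
    if peak == 0.0:
        return samples  # silence; nothing to scale
    return (samples / peak * target_peak).astype(np.float32)
```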
In my app https://github.com/woheller69/whisperIME, based on the Java (not native) code of this project, I can now display the detected language. Sometimes it detects the correct language, German, but translates to English instead of transcribing. If I repeat the sentence again, I get it in German. Any ideas?
When I was trying your app I experienced the same behavior. I think maybe the model needs to be purpose-built for transcription in this case?
It would be a lot easier if we could just use the original models on device :/ For now I've stuck with the OpenAI API, since the cost isn't that bad for the project I'm working on. But I'd still prefer to run transcription on device. I'll just wait it out a bit until that part improves.
It seems to be triggered by some words which may sound English. If I change one word or speak more precisely, it works.
Unfortunately, I don't know how to create these tflite models. It would be great if someone could help and provide a Whisper Small tflite with the transcription function only.
V1.8 of my app uses new models which support both translation and transcription. It will be on F-Droid in a few days.
Which models did you use, if you don't mind sharing? Did you find them, or did you make them with the Colab?
I made them. The .tflites and a Colab are here: https://huggingface.co/DocWolle/whisper_tflite_models
I essentially created 2 signatures:
- one which sets the translation token in forced_decoder_ids
- one which sets the transcription token in forced_decoder_ids
So as long as the model detects the correct language, it will do what it is expected to do (see the sketch below).
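A minimal sketch of the two-signature export idea, assuming the Hugging Face TF port of Whisper; the actual notebook is in the linked Colab, and the class and variable names here are illustrative, not taken from it:

```python
import tensorflow as tf
from transformers import TFWhisperForConditionalGeneration, WhisperProcessor

model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

# Only the task token is forced; no language token, so the model
# auto-detects the spoken language (hence "cannot specify the language").
TRANSCRIBE_IDS = processor.get_decoder_prompt_ids(task="transcribe")
TRANSLATE_IDS = processor.get_decoder_prompt_ids(task="translate")

SPEC = tf.TensorSpec((1, 80, 3000), tf.float32)  # 30 s log-mel spectrogram

class WhisperModule(tf.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    @tf.function(input_signature=[SPEC])
    def serving_transcribe(self, input_features):
        # <|transcribe|> forced -> output stays in the detected language
        ids = self.model.generate(
            input_features, forced_decoder_ids=TRANSCRIBE_IDS, max_new_tokens=200)
        return {"sequences": ids}

    @tf.function(input_signature=[SPEC])
    def serving_translate(self, input_features):
        # <|translate|> forced -> output is always English
        ids = self.model.generate(
            input_features, forced_decoder_ids=TRANSLATE_IDS, max_new_tokens=200)
        return {"sequences": ids}

module = WhisperModule(model)
tf.saved_model.save(module, "whisper_saved", signatures={
    "serving_transcribe": module.serving_transcribe.get_concrete_function(),
    "serving_translate": module.serving_translate.get_concrete_function(),
})

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved")
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,   # standard TFLite ops
    tf.lite.OpsSet.SELECT_TF_OPS,     # fall back to TF ops that generate() needs
]
with open("whisper-small.tflite", "wb") as f:
    f.write(converter.convert())
```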
Can anyone please provide me the .tflite models which transcribe any language, and explain how to generate those models?
In my comment above you have the link. These models transcribe any language; you just cannot specify the language in advance.
The models you provided translate the audio into English. For example, if the audio is in Hindi, the result is the English translation of the Hindi content. However, what I want is for Hindi audio to be transcribed directly into Hindi text.
No. The new model will transcribe to Hindi if you run the signature "serving_transcribe", and it will translate to English if you run "serving_translate".
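For reference, here is a minimal sketch of selecting the task at inference time with the Python TFLite interpreter, assuming a model exported with the two signature names above; the input tensor name `input_features` is an assumption taken from the export sketch, not from the actual .tflite files:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="whisper-small.tflite")

# One runner per exported signature; the names match the export above.
transcribe = interpreter.get_signature_runner("serving_transcribe")
translate = interpreter.get_signature_runner("serving_translate")

# Placeholder input; in practice this is the log-mel spectrogram of the
# recorded audio, shaped (1, 80, 3000) float32.
features = np.zeros((1, 80, 3000), dtype=np.float32)

hindi_ids = transcribe(input_features=features)["sequences"]    # Hindi in -> Hindi tokens
english_ids = translate(input_features=features)["sequences"]   # Hindi in -> English tokens
```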
If you speak Hindi, this offline translator app might also be interesting for you: https://github.com/woheller69/seemless
It is also AI-based and uses the SeamlessM4T small model (which, for a phone, is actually quite big). It can translate between Hindi, English, Spanish, Portuguese, and Russian.