Alternative STT Models

Open maxfahl opened this issue 11 months ago • 3 comments

The current transcription model works great with English. However, I speak Swedish all day, and it kind of sucks for that language.

OpenAI Whisper v2 works wonderfully with Swedish. I would love to be able to select this model. I know there are costs associated with this, but I wouldn't hesitate to use my own API key for it.

I remember there being a setting in the app earlier where you could select between two models, but it seems to have been removed.

Maybe you could select an API to use under Developer Options, where OpenAI Whisper could be one of the options.
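
As a rough illustration of the idea above, here is a minimal sketch of a provider setting with a user-supplied key. Everything in it is an assumption made for illustration, not the app's actual code: the `SttSettings` class, the provider names, and the fallback rule are hypothetical. The endpoint URL, the `whisper-1` model name, and the `language` field follow OpenAI's public transcription API. The request is only described, never sent.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical provider names; the app's real options may differ.
PROVIDERS = ("default", "openai_whisper")


@dataclass
class SttSettings:
    provider: str = "default"
    api_key: Optional[str] = None  # user-supplied key, e.g. from Developer Options
    language: str = "en"           # ISO-639-1 code, e.g. "sv" for Swedish


def resolve_provider(settings: SttSettings) -> str:
    """Return the provider to use, falling back to the built-in one
    when Whisper is selected but no API key has been entered."""
    if settings.provider not in PROVIDERS:
        raise ValueError(f"unknown STT provider: {settings.provider}")
    if settings.provider == "openai_whisper" and not settings.api_key:
        return "default"
    return settings.provider


def build_whisper_request(settings: SttSettings) -> dict:
    """Describe (but do not send) a request to OpenAI's transcription endpoint.
    The audio file itself would be attached as multipart form data."""
    return {
        "url": "https://api.openai.com/v1/audio/transcriptions",
        "headers": {"Authorization": f"Bearer {settings.api_key}"},
        "data": {"model": "whisper-1", "language": settings.language},
    }
```

With `SttSettings(provider="openai_whisper", api_key="...", language="sv")`, `resolve_provider` would pick Whisper. With no key set, it would quietly fall back to the built-in model rather than fail.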

maxfahl avatar Jan 11 '25 10:01 maxfahl

Actually, Soniox is better than Deepgram. They also do a pretty good job with Mandarin Chinese.

goodpeter-sun avatar Jan 27 '25 08:01 goodpeter-sun

Is it using Soniox now? It doesn't really matter, since the transcription I get is complete gibberish anyway. Is there anyone on staff who might answer this? I'd rather use Whisper v2 or something. I suppose I could fork and build the app myself, but I would really prefer having a choice in the app that already exists.

maxfahl avatar Feb 17 '25 06:02 maxfahl

I'll link this to #1249, where there is a discussion regarding STT models. It would probably be great to do it on-device.

skywinder avatar Mar 10 '25 12:03 skywinder

That's needed for sure. I will handle it later. I saved it in my fork so I don't forget to notify you here, @maxfahl: https://github.com/skywinder/omi/issues/2

skywinder avatar May 12 '25 12:05 skywinder