mycroft-core

Add Speech-to-Text backend for coqui-STT

Open lw64 opened this issue 2 years ago • 4 comments

It seems to me that the coqui-STT project has reached a point where it can be used as a backend. Many languages are available, and the performance is also very good: "it is running in realtime on a raspberry pi 4 core".

It is also capable of streaming speech recognition, but as far as I know, that is not yet supported or used anywhere else.

I don't know whether a server, as with the DeepSpeech backend, or direct use of coqui-STT's Python bindings is better.
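
For anyone weighing the two options, a minimal sketch of the direct-bindings route with streaming might look like this. It assumes the `stt` package from Coqui STT is installed and that its `Model`, `enableExternalScorer`, `createStream`, `feedAudioContent`, and `finishStream` calls behave like their DeepSpeech counterparts. The helper names `load_coqui_model` and `transcribe_stream` are made up for illustration and are not part of Mycroft.

```python
from typing import Any, Iterable, Optional


def load_coqui_model(model_path: str, scorer_path: Optional[str] = None) -> Any:
    """Load a Coqui STT model (assumes `pip install stt` has been done)."""
    # Imported lazily so the rest of this sketch works without the package.
    from stt import Model

    model = Model(model_path)
    if scorer_path is not None:
        # Optional language-model scorer, same idea as in DeepSpeech.
        model.enableExternalScorer(scorer_path)
    return model


def transcribe_stream(model: Any, chunks: Iterable[Any]) -> str:
    """Feed audio chunks into a streaming session and return the final text.

    Each chunk is assumed to be a numpy.int16 array of 16-bit mono audio
    at the sample rate the model was trained on.
    """
    stream = model.createStream()
    for chunk in chunks:
        stream.feedAudioContent(chunk)
    # finishStream() decodes whatever has been fed and closes the stream.
    return stream.finishStream()
```

The appeal of this route is that no separate server process is needed. The trade-off is that the model is loaded inside the same process that calls it.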

lw64 avatar Jan 06 '22 23:01 lw64

There's a move to a plugin format for the voice services, and this should be one of the supported types soon.

el-tocino avatar Jan 07 '22 00:01 el-tocino

Coqui STT would be a straightforward drop-in replacement for DeepSpeech, because the APIs are nearly identical :D

Also, the latest English model from Coqui STT is much more accurate than the old DeepSpeech model.

JRMeyer avatar Feb 15 '22 22:02 JRMeyer

I'm running Coqui STT on my Picroft as described here (as a REST API the same way DeepSpeech is currently integrated into Mycroft). I needed it to work quickly somehow, so it might not be the best solution, but maybe it is helpful anyway for someone planning to do it right.
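
A rough sketch of what the client side of such a REST setup could look like, using only the Python standard library. The URI, the `audio/wav` content type, and the plain-text response body are assumptions about the server, and the function names are hypothetical. This is not Mycroft's actual implementation.

```python
import urllib.request

# Assumed endpoint; adjust to wherever the STT server actually listens.
DEFAULT_URI = "http://localhost:8080/stt"


def build_stt_request(wav_bytes: bytes, uri: str = DEFAULT_URI) -> urllib.request.Request:
    """Build a POST request carrying raw WAV data in the body."""
    return urllib.request.Request(
        uri,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def transcribe_via_server(wav_bytes: bytes, uri: str = DEFAULT_URI, timeout: float = 10.0) -> str:
    """Send WAV audio to the server and return the transcript.

    Assumes the server answers with the transcript as plain UTF-8 text.
    """
    request = build_stt_request(wav_bytes, uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling `transcribe_via_server(open("utterance.wav", "rb").read())` would then return whatever text the server produces, provided a server is actually running at that address.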

hslr4 avatar Apr 13 '22 09:04 hslr4

@hslr4 maybe you could create a pull request for the integration into Mycroft?

lw64 avatar Apr 14 '22 10:04 lw64