jovo-framework

[Feature Request] Enable offline Speech-to-Text and Text-to-Speech

Open JRMeyer opened this issue 2 years ago • 6 comments

👋 hi there!

I'm submitting a...

  • [ ] Bug report
  • [x] Feature request
  • [ ] Documentation issue or request
  • [ ] Other... Please describe:

Expected Behavior

It would be great to be able to test and debug a voice bot without an internet connection. Offline STT and TTS (from @coqui-ai) would make this possible using the existing UX of the new Jovo Debugger.

Current Behavior

Currently there's no offline STT or TTS.

JRMeyer avatar Feb 16 '22 02:02 JRMeyer

Hi there. Thank you.

This is not on our immediate roadmap, but would be a great community contribution.

Coqui STT could be implemented as a Jovo ASR integration.

jankoenig avatar Feb 16 '22 10:02 jankoenig

Hi @jankoenig -- I just looked into the integration with Lex, and it would be considerably different with Coqui because the user would have their own server running. For example, the user might be running a simple server on their local desktop, or they might have spun up a server on AWS and be using endpoints there. In either case, the API syntax and integration would be identical, but there would be an expectation that the user spins up the server themselves. That's not too difficult, but I'm not sure if it's something the Jovo crowd would be interested in.
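
To make the idea concrete, here is a minimal TypeScript sketch of a client for a self-hosted STT server. Only the base URL differs between a desktop and a cloud deployment; the request and response handling stay the same. The `/stt` route, the `{ text }` response shape, and every name below are assumptions for illustration, not a documented Coqui or Jovo API. The HTTP call is passed in as a function so the sketch does not depend on a particular runtime.

```typescript
// Hypothetical sketch: route, response shape, and names are assumptions.

interface SttServerConfig {
  baseUrl: string; // e.g. "http://localhost:8080" or a cloud-hosted URL
  path?: string; // assumed route on the server, defaults to "/stt"
}

interface SttResult {
  text: string;
}

// Minimal shape of an HTTP POST helper (for example a thin wrapper around fetch).
type HttpPost = (
  url: string,
  audio: Uint8Array,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Joins base URL and route without producing duplicate slashes.
function buildSttUrl(config: SttServerConfig): string {
  const base = config.baseUrl.replace(/\/+$/, '');
  const path = (config.path || '/stt').replace(/^\/*/, '/');
  return base + path;
}

// Validates the assumed { text: string } response body.
function parseSttResponse(body: unknown): SttResult {
  if (typeof body === 'object' && body !== null) {
    const text = (body as { text?: unknown }).text;
    if (typeof text === 'string') {
      return { text };
    }
  }
  throw new Error('Unexpected STT response shape');
}

// Sends raw audio to the user's own server and returns the transcript.
async function transcribe(
  audio: Uint8Array,
  config: SttServerConfig,
  post: HttpPost,
): Promise<SttResult> {
  const response = await post(buildSttUrl(config), audio);
  if (!response.ok) {
    throw new Error('STT server returned status ' + response.status);
  }
  return parseSttResponse(await response.json());
}
```

Switching from a local server to a cloud one would then be a one-line change to `baseUrl`, which is the property that makes local, offline testing straightforward.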

I think the biggest value add for Jovo users would be to be able to test out their voicebots locally, without having an ASR backend running on one of the providers (like Lex).

Thoughts?

JRMeyer avatar Feb 17 '22 20:02 JRMeyer

This could work similarly to our Snips NLU integration, where people also have to run their own servers.

An integration like this would also be useful for our web starters:

  • https://github.com/jovotech/jovo-starter-web-standalone
  • https://github.com/jovotech/jovo-starter-web-standalone-react

jankoenig avatar Feb 18 '22 09:02 jankoenig

Yeah, I think a general setup mirroring the Snips approach would work nicely. You know of anyone in your community who might like to hack on this? We're happy to offer support/guidance for using the Coqui tools.

JRMeyer avatar Feb 18 '22 21:02 JRMeyer

I think I could give this a spin :)

rubenaeg avatar Feb 18 '22 22:02 rubenaeg

@JRMeyer Are there any developer docs on the Coqui APIs for STT and TTS using Node.js or REST?

rmtuckerphx avatar Aug 26 '22 05:08 rmtuckerphx