[Roadmap] Add support for more/generic TTS output sources

Open PylotLight opened this issue 5 months ago • 7 comments

Why: Currently, TTS support is hardcoded for ElevenLabs only. It should instead allow a generic /tts endpoint that returns an audio file or stream, which could be sourced from any self-hosted or external provider. While this request is focused on a self-hosted setup, it would work for non-local use as well.

Description: In the voice settings menu, allow a "custom" TTS endpoint where you would enter, e.g., http://localhost/tts, perhaps with body/payload parameter settings that are sent to the provider at runtime. The returned audio file is then consumed and played as normal.

Requirements: If you can, please break down the changes: use cases, UX, technology, architecture, etc.

  • [ ] Add new menu option to voice settings menu
  • [ ] Remove the hardcoded ElevenLabs support and switch to a generic TTS stream/static audio file format that any provider can supply.
  • [ ] Based on selection in voice menu, use the relevant provider to generate audio.
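
As a rough illustration of the "custom" endpoint idea above, here is a minimal TypeScript sketch. Everything in it is an assumption made for illustration: the `CustomTtsConfig` shape, the `text` field name in the JSON payload, and the `buildTtsRequest`/`synthesize` helpers are hypothetical and are not part of big-AGI or of any particular TTS provider.

```typescript
// Hypothetical settings for a user-defined TTS endpoint (not big-AGI's actual schema).
interface CustomTtsConfig {
  endpoint: string;                     // e.g. "http://localhost/tts"
  extraBody?: Record<string, unknown>;  // custom payload params sent at runtime
  headers?: Record<string, string>;     // e.g. an API key header, if the provider needs one
}

interface TtsRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Minimal fetch-like signature so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: TtsRequest["init"],
) => Promise<{ ok: boolean; status: number; arrayBuffer(): Promise<ArrayBuffer> }>;

// Build the HTTP request for the endpoint.
// Assumption: the provider accepts a JSON body with the text under a "text" key.
function buildTtsRequest(config: CustomTtsConfig, text: string): TtsRequest {
  return {
    url: config.endpoint,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", ...(config.headers ?? {}) },
      body: JSON.stringify({ ...(config.extraBody ?? {}), text }),
    },
  };
}

// Send the request and return the raw audio bytes, whatever format the provider uses.
async function synthesize(
  config: CustomTtsConfig,
  text: string,
  fetchFn: FetchLike,
): Promise<ArrayBuffer> {
  const { url, init } = buildTtsRequest(config, text);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }
  return response.arrayBuffer();
}
```

In a browser, the global `fetch` could be passed as `fetchFn`, and the returned bytes could be wrapped in a `Blob` and played through an `Audio` element. Keeping the request builder separate from the network call means the voice settings menu only needs to store a `CustomTtsConfig`.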

PylotLight avatar Mar 07 '24 10:03 PylotLight

Thanks, this is a good request @PylotLight. I think the next support should come through the LocalAI TTS models.

What are your favorite options for alternative TTS?

  • [ ] LocalAI
  • [ ] Browser default TTS (Web Speech API)
  • [ ] OpenAI TTS
  • [ ] Play.ht
  • [ ] ..?

enricoros avatar Mar 28 '24 16:03 enricoros

So we obviously have a goal of supporting as much as possible with as little integration work as possible, right?

How generic can we make it in terms of providing a TTS endpoint and getting back streamed audio?
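
One provider-agnostic way to handle the "streamed audio" side is to treat the response as a sequence of byte chunks, regardless of who produced them. The sketch below assumes a reader modeled on what `response.body.getReader()` returns in browsers. The `ChunkReader` interface and the `concatChunks`/`readAudioStream` helpers are hypothetical names used only for illustration.

```typescript
// Minimal reader shape, modeled on the browser's stream reader (assumption).
interface ChunkReader {
  read(): Promise<{ done: boolean; value?: Uint8Array }>;
}

// Join the received chunks into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Read until the stream ends. Each chunk is passed to onChunk as it arrives
// (for example, to feed a player), and the full audio is returned at the end.
async function readAudioStream(
  reader: ChunkReader,
  onChunk?: (chunk: Uint8Array) => void,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) {
      chunks.push(value);
      onChunk?.(value);
    }
  }
  return concatChunks(chunks);
}
```

With this shape, a provider that returns a static file simply shows up as a stream with one or a few chunks, so the same consumer covers both cases.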

PylotLight avatar Mar 29 '24 00:03 PylotLight

From Discord: [image]

enricoros avatar Mar 30 '24 03:03 enricoros

Note: Azure has some decent free TTS/STT API options as well, as another option to add to the list: https://azure.microsoft.com/en-us/products/ai-services/text-to-speech

PylotLight avatar Apr 01 '24 13:04 PylotLight