[FEAT]: Enhanced OpenAI Compatible TTS Functionality
What would you like to see?
Currently, the TTS options are limited, especially for OpenAI Compatible endpoints.
The model defaults to tts-1 and there is no option to change it. Additionally, there is no response splitting, so long replies are synthesized as one block and the result is an unnatural conversation flow.
Compare this to Open-WebUI, where these options are available and, against the same endpoint, the resulting TTS experience is superior. In fact, the 'call' feature paired with the matatonic/openedai-speech repo and a GPU is fantastic.
This feature suggestion proposes adding a way to control the model used for "openAiGeneric" TTS (perhaps still defaulting to tts-1?) and replicating the response-splitting feature offered by Open-WebUI. A rough sketch of both ideas follows below.
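
To make the request concrete, here is a minimal sketch, assuming the endpoint follows OpenAI's `/v1/audio/speech` contract. None of this is existing anything-llm code: `TTS_MODEL`, `TTS_VOICE`, `TTS_BASE_URL`, `splitIntoSentences`, and `speakResponse` are all hypothetical names used for illustration only.

```ts
// Hypothetical sketch only: these env vars and helpers are not actual
// anything-llm settings or internals.

const TTS_BASE_URL = process.env.TTS_BASE_URL ?? "https://api.openai.com/v1";
const TTS_MODEL = process.env.TTS_MODEL ?? "tts-1"; // configurable, tts-1 stays the default
const TTS_VOICE = process.env.TTS_VOICE ?? "alloy";

// Naive splitter: break a long response on sentence boundaries so each
// chunk can be synthesized (and played) as soon as it is ready.
function splitIntoSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Synthesize one chunk via the OpenAI-compatible /audio/speech endpoint.
async function synthesize(chunk: string): Promise<ArrayBuffer> {
  const res = await fetch(`${TTS_BASE_URL}/audio/speech`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.TTS_API_KEY ?? ""}`,
    },
    body: JSON.stringify({
      model: TTS_MODEL, // instead of a hardcoded "tts-1"
      voice: TTS_VOICE,
      input: chunk,
      response_format: "mp3",
    }),
  });
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  return res.arrayBuffer();
}

// Speak a full LLM response chunk by chunk for a more natural flow.
async function speakResponse(text: string): Promise<ArrayBuffer[]> {
  return Promise.all(splitIntoSentences(text).map(synthesize));
}
```

Splitting on sentence boundaries means playback of the first chunk can begin while later chunks are still being synthesized, which is presumably what makes Open-WebUI's conversations feel more natural on the same endpoint.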