missing TTS voices from OpenAI

Open • Dual-0 opened this issue 10 months ago • 1 comment

The OpenAI TTS endpoint provides 11 built‑in voices to control how speech is rendered from text. You can listen to and experiment with these voices at OpenAI.fm. However, I got an error when using ballad, for example. baibot only accepts alloy, ash, coral, echo, fable, onyx, nova, sage or shimmer.

Error: Failed to create static agent static/openai: Yaml(Error("unknown variant `ballad`, expected one of `alloy`, `ash`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`"))

It would be nice to be able to use all 11 voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse

Dual-0, Apr 14 '25 19:04

This is due to a limitation in the library we're using (64bit/async-openai).

It currently only supports these voices and needs to be updated to include the new ones. At that point, we can update baibot to use the new async-openai release.


There's a closed issue about this (https://github.com/64bit/async-openai/issues/321) and a related PR (https://github.com/64bit/async-openai/pull/326).

For some reason, the PR did a half-assed job and only added some of the voices, not all of them (e.g. ballad and verse are still missing).

Consider creating an issue there or even doing a proper PR that fixes the voice list for both enums (Voice in types/audio.rs and ChatCompletionAudioVoice in types/chat.rs).
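To illustrate what a complete voice list could look like, here is a rough, std-only sketch. It is not the actual async-openai code: the real enums are presumably deserialized via serde (the error above comes through YAML parsing), whereas this stand-in uses a hand-written FromStr. The names `ALL`, `as_str` and `UnknownVoice` are hypothetical and only exist for this example; the error text simply mimics the message quoted in the issue.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a voice enum covering all 11 voices
/// listed in this issue. Not the real async-openai type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Voice {
    Alloy,
    Ash,
    Ballad,
    Coral,
    Echo,
    Fable,
    Onyx,
    Nova,
    Sage,
    Shimmer,
    Verse,
}

impl Voice {
    /// Every variant, in the order used in the issue text.
    pub const ALL: [Voice; 11] = [
        Voice::Alloy,
        Voice::Ash,
        Voice::Ballad,
        Voice::Coral,
        Voice::Echo,
        Voice::Fable,
        Voice::Onyx,
        Voice::Nova,
        Voice::Sage,
        Voice::Shimmer,
        Voice::Verse,
    ];

    /// Lowercase name as it would appear in a config file.
    pub fn as_str(self) -> &'static str {
        match self {
            Voice::Alloy => "alloy",
            Voice::Ash => "ash",
            Voice::Ballad => "ballad",
            Voice::Coral => "coral",
            Voice::Echo => "echo",
            Voice::Fable => "fable",
            Voice::Onyx => "onyx",
            Voice::Nova => "nova",
            Voice::Sage => "sage",
            Voice::Shimmer => "shimmer",
            Voice::Verse => "verse",
        }
    }
}

/// Hypothetical error type returned for names outside the known list.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownVoice(pub String);

impl fmt::Display for UnknownVoice {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Build "`alloy`, `ash`, ..." from the single source of truth.
        let expected: Vec<String> = Voice::ALL
            .iter()
            .map(|v| format!("`{}`", v.as_str()))
            .collect();
        write!(
            f,
            "unknown variant `{}`, expected one of {}",
            self.0,
            expected.join(", ")
        )
    }
}

impl FromStr for Voice {
    type Err = UnknownVoice;

    /// Exact, case-sensitive match against the known voice names.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Voice::ALL
            .iter()
            .copied()
            .find(|v| v.as_str() == s)
            .ok_or_else(|| UnknownVoice(s.to_string()))
    }
}
```

With something like this, `"ballad".parse::<Voice>()` succeeds, while an unlisted name returns an `UnknownVoice` error. Keeping the list in a single `ALL` constant means the parser and the "expected one of" message can't drift apart, which may be worth mirroring if both enums get updated in one PR.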

Once a new async-openai release is out, baibot can update to it.

spantaleev, Apr 15 '25 06:04