
Support `other` voices for OpenAI compatible APIs

Open zatevakhin opened this issue 2 months ago • 1 comments

Currently, the supported voices are limited to the built-in OpenAI voices, so there is no way to use this crate with other OpenAI-compatible API providers that offer different voices.

See: https://github.com/64bit/async-openai/blob/7964f860e556664cf14746d9cc7c5088a1103145/async-openai/src/types/audio.rs#L36-L51

I would like to propose adding support for other voices, similar to how other models are already supported through the `Other` enum variant.

See: https://github.com/64bit/async-openai/blob/7964f860e556664cf14746d9cc7c5088a1103145/async-openai/src/types/audio.rs#L53-L62
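To illustrate the proposal, here is a rough, std-only sketch of what a `Voice` enum with a catch-all variant could look like. It is hypothetical and not the crate's actual definition: the real enum has more voices and is serialized with serde, which is replaced here by hand-written `as_str` and `From<&str>` helpers (both my own assumptions) so the example stays self-contained. The voice name `af_heart` is only an example of a provider-specific name.

```rust
/// Hypothetical sketch of the proposed enum; NOT the crate's real definition.
/// Only two built-in voices are listed to keep the example short.
#[derive(Debug, Clone, PartialEq)]
pub enum Voice {
    Alloy,
    Ash,
    /// Proposed catch-all for voices offered by OpenAI-compatible providers.
    Other(String),
}

impl Voice {
    /// Returns the string that would be sent in the request's `voice` field.
    /// (Assumed helper; the crate itself relies on serde for this.)
    pub fn as_str(&self) -> &str {
        match self {
            Voice::Alloy => "alloy",
            Voice::Ash => "ash",
            Voice::Other(name) => name.as_str(),
        }
    }
}

impl From<&str> for Voice {
    /// Maps known names to built-in variants and anything else to `Other`.
    fn from(name: &str) -> Self {
        match name {
            "alloy" => Voice::Alloy,
            "ash" => Voice::Ash,
            other => Voice::Other(other.to_string()),
        }
    }
}
```

With something along these lines, a request builder could presumably accept `.voice(Voice::Other("af_heart".to_string()))`, mirroring how `SpeechModel::Other` is used for custom models.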

Here is a minimal snippet of the code I used.

use async_openai::{
    Client,
    config::OpenAIConfig,
    types::{CreateSpeechRequestArgs, SpeechModel, Voice},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let base_url = std::env::var("BASE_URL").unwrap_or("http://localhost:8001/v1/".into());
    let api_key = std::env::var("OPENAI_API_KEY").unwrap_or("sk-NO_NEED_FOR_REAL_KEY".into());

    let text = "Hello! Test Test Test!";

    let client = Client::with_config(
        OpenAIConfig::new()
            .with_api_key(api_key)
            .with_api_base(base_url),
    );

    let request = CreateSpeechRequestArgs::default()
        .input(text)
        .voice(Voice::Ash) // Only built-in OpenAI voices are accepted; no way to set a custom voice.
        .model(SpeechModel::Other(
            "speaches-ai/Kokoro-82M-v1.0-ONNX".to_string(),
        ))
        .build()?;

    let response = client.audio().speech(request).await?;

    response.save("./data/audio.mp3").await?;

    Ok(())
}

As the custom OpenAI-compatible provider, I used the latest (0.8.3) Docker container from https://speaches.ai/ (similar to Ollama, but for TTS/STT).

zatevakhin avatar Sep 25 '25 19:09 zatevakhin

Related: #438

simbleau avatar Oct 03 '25 15:10 simbleau