Feature request: allow explicit "lang_code" parameter in /v1/audio/speech
Description:
Hello, I’m currently using kokoro-api via the /v1/audio/speech endpoint and I’d like to explicitly set the synthesis language. I noticed that the API doesn’t officially expose a lang_code parameter in the request body, and language is currently inferred only from the voice value.
Current behavior:
When the selected voice does not exactly match the desired language, there’s no way to force the language via request.
In scenarios with multiple mixed voices (e.g., "pf_dora+af_sky"), the main language may not be correctly inferred.
This limits compatibility with existing OpenAI-compatible clients, where lang_code or language is a common parameter.
Proposed solution:
Add official support for a lang_code field (e.g., "p" for Brazilian Portuguese) in the request payload.
If provided, this parameter should override the language inferred from the voice.
Benefits:
Gives developers more control over the synthesis language.
Improves compatibility with existing SDKs (e.g., Python/JS openai clients) that already allow language or lang_code.
Enables multi-language scenarios without relying solely on specific voice names.
Desired usage example:
POST /v1/audio/speech
{
  "model": "kokoro",
  "voice": "pf_dora",
  "lang_code": "p",   // Brazilian Portuguese
  "input": "Olá, este é um teste de voz em português do Brasil."
}
I've looked at the Kokoro package this repo uses; this feature was likely omitted because lang_code is specified on a sub-object, KPipeline, which can take a few seconds to re-initialize.
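If the re-initialization cost is the blocker, one possible mitigation is to cache one pipeline per lang_code, so each language pays the construction cost only once. The sketch below uses a stand-in stub class rather than the real kokoro KPipeline (which loads models and is slow to construct); the caching pattern is the point, not the stub.

```python
from functools import lru_cache


class KPipeline:
    """Stand-in stub for kokoro's KPipeline; the real class is expensive
    to construct, which is why instances are cached per language code."""

    def __init__(self, lang_code: str):
        self.lang_code = lang_code


@lru_cache(maxsize=None)
def get_pipeline(lang_code: str) -> KPipeline:
    # Construct one pipeline per language code; lru_cache returns the
    # same instance for repeated requests with the same lang_code.
    return KPipeline(lang_code)
```

With this pattern, a request carrying `lang_code` would only be slow the first time a given language is used, which may make the feature practical to add server-side.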
I am working on a minimalistic repo for Kokoro at https://github.com/skwzrd/kokoro_slim. It does not yet offer an OpenAI-compatible API, but it lets you change the language code via a web page and a Python wrapper.