Expose Chatterbox TTS model arguments on the Chatterbox backend

Open rampa3 opened this issue 5 months ago • 2 comments

Is your feature request related to a problem? Please describe.

Chatterbox TTS, as a voice-cloning TTS, has parameters that need to be provided together with the audio prompt to tune the output voice. These parameters are exaggeration, temperature, and cfg_weight. Without access to them, tuning the output is impossible.

Describe the solution you'd like

I would like to propose exposing these parameters on the API, or at least in the model file, to allow adjusting them on a voice-by-voice basis.

Describe alternatives you've considered

Finding the best preset for most of the user's voices using an external Chatterbox install, then building the backend from source with this preset. (Applicable only to advanced users.)

rampa3 avatar Aug 06 '25 12:08 rampa3

Makes sense. We could use the options field, like for example we do in diffusers: https://github.com/mudler/LocalAI/blob/master/backend/python/diffusers/backend.py#L173

mudler avatar Aug 07 '25 07:08 mudler
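
For illustration, the options-field idea suggested above could be sketched roughly as follows. This is a minimal sketch, not the actual LocalAI or Chatterbox code: it assumes the backend receives options as a list of `key:value` strings, and the helper name `parse_tts_options` and the default values are hypothetical.

```python
from typing import Dict, Iterable

# Assumed defaults for illustration only; check the Chatterbox source
# for the real values before relying on them.
DEFAULTS: Dict[str, float] = {
    "exaggeration": 0.5,
    "temperature": 0.8,
    "cfg_weight": 0.5,
}


def parse_tts_options(options: Iterable[str]) -> Dict[str, float]:
    """Turn 'key:value' strings into float keyword arguments.

    Hypothetical helper: unknown keys, malformed entries, and
    non-numeric values are ignored and the default is kept.
    """
    params = dict(DEFAULTS)
    for opt in options:
        if ":" not in opt:
            continue  # skip entries that are not key:value pairs
        key, value = opt.split(":", 1)
        key = key.strip()
        if key not in DEFAULTS:
            continue  # ignore options this sketch does not know about
        try:
            params[key] = float(value)
        except ValueError:
            pass  # keep the default when the value is not numeric
    return params


# Example: two values overridden, temperature left at its default.
kwargs = parse_tts_options(["exaggeration:0.7", "cfg_weight:0.3"])
```

The resulting dictionary could then be passed as keyword arguments to the TTS generation call alongside the audio prompt, assuming that call accepts parameters with these names.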

> Makes sense. We could use the options field, like for example we do in diffusers: https://github.com/mudler/LocalAI/blob/master/backend/python/diffusers/backend.py#L173

Is the options field also usable in model files, or just on the API? The API is great for active tuning, but for final parameters, one might want to save them in the YAML. Sorry if I am asking about something obvious; I am still lost in some parts of the backend code.

rampa3 avatar Aug 07 '25 13:08 rampa3