Request: Multilingual TTS with Kokoro 1.0

Open • thuongvovan opened this issue 9 months ago • 2 comments

Hi team,

This project is truly amazing! Kokoro 1.0 is a multilingual TTS model, and I propose we integrate it with MLX-Audio to build a versatile, natural-sounding multilingual speech system, like here.

It has huge potential for multilingual content creation and accessibility. Looking forward to your thoughts!

One note: the current implementation seems to have an issue with the Chinese (zh) voice: it sounds closer to Cantonese than to Mandarin.

Thank you

thuongvovan, May 22 '25 02:05

Japanese also seems to have many mispronunciations.

thuongvovan, May 22 '25 02:05

Thank you very much!

We need to investigate this further, but that is hard for me because I only speak Portuguese, Spanish, English, and a bit of Polish and Hindi.

Could you share example samples where it fails, alongside the correct pronunciations? That would make it much easier for me to debug.

Blaizzy, May 24 '25 11:05