Kokoro-FastAPI fix Chinese and English mixing

The original code handles pure Chinese pronunciation very well, but with mixed Chinese and English text the English pronunciation is lost. I have improved this behavior. However, download_model.py support is missing because the release has no download address for Kokoro-82M-v1.1-zh. #214

Mar 13 '25 07:03 chai51

@chai51 I'm not sure I understand what exactly this is trying to fix and how/what it fixes?

Mar 26 '25 14:03 fireblade2534

@fireblade2534

Purely a guess on my part, but I think this is a request for support for "Kokoro-82M-v1.1-zh" which is supposed to be better with at least Chinese, and which only handles Chinese and English.

Mar 26 '25 14:03 RBEmerson970

With the Voice set to zf_xiaoyi and the Language set to Chinese, the following two cases illustrate the problem:
"该模型是经过短期训练的结果，从专业数据集中添加了 100 名中文使用者。" (English: "This model is the result of a short training run that added 100 Chinese speakers from a professional dataset.") This text is synthesized with completely accurate pronunciation.
"Kokoro 是一系列体积虽小但功能强大的 TTS 模型。" (English: "Kokoro is a series of small but powerful TTS models.") In this sentence, "Kokoro" and "TTS" are pronounced incorrectly.
Previously, mixed Chinese and English text lost the English part of the pronunciation. I used the new Kokoro module to handle English and Chinese separately, which solves the lost English pronunciation. Since mixing Chinese and English is very common, this improvement broadens the range of Chinese scenarios that can be handled.
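
To make the two cases above concrete, here is a minimal sketch that lists the English runs in a sentence, which are the parts that get lost. It is only an illustration, not code from this PR: the names `EN_RUN`, `english_spans`, and `is_mixed` are hypothetical, and treating ASCII-letter runs as "English" and the CJK Unified Ideographs block as "Chinese" is a simplifying assumption.

```python
import re
from typing import List

# Assumption: a run of ASCII letters (digits, apostrophes, and hyphens allowed
# after the first letter) counts as an English word or abbreviation.
EN_RUN = re.compile(r"[A-Za-z][A-Za-z0-9'\-]*")


def english_spans(text: str) -> List[str]:
    """Return the English runs found in the text, in order of appearance."""
    return EN_RUN.findall(text)


def is_mixed(text: str) -> bool:
    """True if the text contains both a CJK ideograph and an English run."""
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in text)
    return has_cjk and EN_RUN.search(text) is not None


pure = "该模型是经过短期训练的结果，从专业数据集中添加了 100 名中文使用者。"
mixed = "Kokoro 是一系列体积虽小但功能强大的 TTS 模型。"

print(english_spans(pure), is_mixed(pure))
print(english_spans(mixed), is_mixed(mixed))
```

The first sentence contains no English runs, while the second yields "Kokoro" and "TTS", the two words reported as mispronounced.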

Mar 27 '25 10:03 chai51

@chai51 So what you're saying is that if the language is Chinese, it loads kokoro v1.1 instead of the normal kokoro v1 model. Some questions:

What converts the Chinese and English text to phonemes in such a way that both the English and the Chinese pronunciation are maintained?
Does it load v1.1 and v1 at the same time?
Does it use the text normalization system for the English parts of the text, or is that skipped? (Text normalization only works for English right now, so it is automatically disabled if the lang code requests a different language.)

Mar 28 '25 16:03 fireblade2534

Yeah, you're right.

The submitted code at api/src/inference/kokoro_v1.py:87 passes a callback function to KPipeline, which separates the Chinese and English parts. At api/src/inference/kokoro_v1.py:61, the callback function returns the English part, which is synthesized using the a-KPipeline. For the specific implementation, refer to make_zh.py.
Only v1.1 is loaded.
Since English only appears as individual words or abbreviations in mixed-pronunciation scenarios, text normalization is skipped for the English part.
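
The callback approach described above can be sketched roughly as follows. This is a self-contained illustration, not the code in kokoro_v1.py or make_zh.py: `split_runs`, `to_phonemes`, `en_callback`, and the two fake converters are hypothetical names, and the regex-based splitting is an assumption about how English runs might be detected. The real phoneme converters are replaced by stand-ins that only tag their input.

```python
import re
from typing import Callable, List, Tuple

# Assumption: ASCII-letter runs are the English words or abbreviations.
EN_RUN = re.compile(r"[A-Za-z][A-Za-z0-9'\-]*")


def split_runs(text: str) -> List[Tuple[str, str]]:
    """Split text into ("en", run) and ("zh", run) pieces, preserving order."""
    pieces: List[Tuple[str, str]] = []
    pos = 0
    for match in EN_RUN.finditer(text):
        if match.start() > pos:
            pieces.append(("zh", text[pos:match.start()]))
        pieces.append(("en", match.group()))
        pos = match.end()
    if pos < len(text):
        pieces.append(("zh", text[pos:]))
    return pieces


def to_phonemes(
    text: str,
    zh_g2p: Callable[[str], str],
    en_callback: Callable[[str], str],
) -> str:
    """Send English runs to the callback and everything else to the Chinese converter."""
    out = []
    for lang, run in split_runs(text):
        out.append(en_callback(run) if lang == "en" else zh_g2p(run))
    return "".join(out)


# Stand-ins for real grapheme-to-phoneme converters (hypothetical).
def fake_zh_g2p(run: str) -> str:
    run = run.strip()
    return f"<zh:{run}>" if run else ""


def fake_en_g2p(run: str) -> str:
    return f"<en:{run}>"


print(to_phonemes("Kokoro 是 TTS 模型。", fake_zh_g2p, fake_en_g2p))
```

Because the English converter is passed in as a plain callable, the Chinese pipeline does not need to know how English is handled, and no English text normalization step is involved in this sketch.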

Mar 31 '25 02:03 chai51

Hey, this is a must-have fix for text with mixed Chinese and English words. When can this PR be merged? It seems there are no code conflicts.

May 15 '25 07:05 thiner

Has the code been merged? I also need support for mixed Chinese and English.

Jun 10 '25 01:06 fuyuhnag168

Could someone help to merge it? Thanks!

Jul 19 '25 20:07 happynocode

Has the author reviewed the changes that try to fix mixed Chinese and English words, and is there a plan to merge the branch? Please make time for this.

Jul 22 '25 03:07 crazyn2

@fireblade2534 is anything blocking this PR? If not, could you please merge it? It's very important for scenarios with mixed Chinese and English words.

Jul 22 '25 04:07 thiner

So why hasn't it been merged? This is very important! 🙏

Nov 26 '25 17:11 tovarsh

Hi @ThatCoders, this PR has some merge conflicts and we haven't gotten to it yet as we work through the backlog.

Nov 26 '25 17:11 remsky

Got it, thank you for your reply and your contribution to open source.

Nov 26 '25 17:11 tovarsh