How can I switch the model to hexgrad/Kokoro-82M-v1.1-zh? What should I do?

Mar 03 '25 16:03 zhy844694805

Wait until it's out of beta. If I understand correctly, they have removed several voices for now, until it's production ready. Is there a particular reason you want to use this version? Does it have any exciting features I don't know about yet?

Mar 04 '25 14:03 gitchat1

1. Get the resources

sudo apt install espeak-ng

git lfs install
cd api/src/models
git clone https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh
mv Kokoro-82M-v1.1-zh v1_1-zh

cp -r v1_1-zh/voices ../voices/v1_1-zh

2. Modify the code

default_voice: str = "af_heart"

Change this to the zf_094 voice sample so that Kokoro loads with Chinese.

        "v1_0/kokoro-v1_0.pth", description="PyTorch Kokoro V1 model filename"

Change to "v1_1-zh/kokoro-v1_1-zh.pth", description="PyTorch Kokoro V1 model filename"

    if language not in lang_map:
        raise ValueError(f"Unsupported language code: {language}")

    return EspeakBackend(lang_map[language])

Change to return EspeakBackend("cmn")

3. Modify the startup script

export VOICES_DIR=src/voices/v1_1-zh

4. Start using it

Visit http://localhost:8880/web/. In "Search voices...", pick any voice whose name starts with z, then change Language from Auto to Chinese.

Mar 10 '25 09:03 chai51

Which file contains the line return EspeakBackend(lang_map[language])?

Mar 10 '25 15:03 luoxxib

I made the changes following your steps, but the tones of the generated speech are wrong.

Mar 10 '25 15:03 luoxxib

Apply these steps to the original GitHub project. Be warned first that this may break your existing environment.

1. Get the resources

sudo apt install espeak-ng

git lfs install
cd api/src/models
git clone https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh
mv Kokoro-82M-v1.1-zh v1_1-zh

cp -r v1_1-zh/voices ../voices/v1_1-zh

pip uninstall kokoro
pip install kokoro
pip install "misaki[zh]"

2. Modify the code

api/src/core/config.py

    allow_local_voice_saving: bool = (
        False  # Whether to allow saving combined voices locally
    )

Change to

    allow_local_voice_saving: bool = (
        False  # Whether to allow saving combined voices locally
    )
    repo_id: str = "hexgrad/Kokoro-82M"

api/src/core/model_config.py

    # Model filename
    pytorch_kokoro_v1_file: str = Field(
        "v1_0/kokoro-v1_0.pth", description="PyTorch Kokoro V1 model filename"
    )

Change to

    # Model filename
    pytorch_kokoro_v1_file: str = Field(
        "v1_1-zh/kokoro-v1_1-zh.pth", description="PyTorch Kokoro V1 model filename"
    )

api/src/inference/kokoro_v1.py

            # First block
            self._model = KModel(config=config_path, model=model_path).eval()

            # Second block
            self._pipelines[lang_code] = KPipeline(
                lang_code=lang_code, model=self._model, device=self._device
            )

Change to

            # First block
            self._model = KModel(config=config_path, model=model_path, repo_id=settings.repo_id).eval()

            # Second block
            self._pipelines[lang_code] = KPipeline(
                lang_code=lang_code, model=self._model, device=self._device, repo_id=settings.repo_id
            )

api/src/inference/model_manager.py

                warmup_text = "Warmup text for initialization."

Change to

                warmup_text = "初始化的预热文本。"

api/src/services/text_processing/phonemizer.py

    if language not in lang_map:
        raise ValueError(f"Unsupported language code: {language}")

    return EspeakBackend(lang_map[language])

Change to

    return EspeakBackend("cmn")
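
Hardcoding "cmn" sends every language through the Mandarin backend. A gentler variant, shown here only as a sketch, keeps the lookup and adds a Mandarin entry instead. The contents of LANG_MAP and the "z" key are assumptions for illustration (the real lang_map lives in phonemizer.py and may differ), and resolve_espeak_language is a hypothetical helper that only returns the espeak-ng language name without importing phonemizer.

```python
# Hypothetical mapping from single-letter language codes to espeak-ng names.
LANG_MAP = {
    "a": "en-us",  # assumed existing entry
    "b": "en-gb",  # assumed existing entry
    "z": "cmn",    # added: Mandarin, the backend name used in this thread
}


def resolve_espeak_language(language, lang_map=LANG_MAP):
    """Return the espeak-ng language name, keeping the original validation."""
    if language not in lang_map:
        raise ValueError(f"Unsupported language code: {language}")
    return lang_map[language]
```

In phonemizer.py the result would then be passed on as EspeakBackend(resolve_espeak_language(language)).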

3. Modify the startup script

export VOICES_DIR=src/voices/v1_1-zh
export DEFAULT_VOICE=zf_094
export REPO_ID=hexgrad/Kokoro-82M-v1.1-zh
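
Why these exports take effect: a minimal sketch, assuming the project's settings class maps environment variables onto same-named fields case-insensitively (the default behaviour of pydantic-settings' BaseSettings). SketchSettings below is a standard-library stand-in written for illustration, not the project's code, and the voices_dir default is made up.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class SketchSettings:
    """Stand-in for the real settings class (illustration only)."""

    default_voice: str = "af_heart"      # default quoted in this thread
    repo_id: str = "hexgrad/Kokoro-82M"  # default added to config.py above
    voices_dir: str = "voices/v1_0"      # hypothetical default

    @classmethod
    def from_env(cls, environ=None):
        """Build settings, letting upper-case env vars override the defaults."""
        environ = os.environ if environ is None else environ
        overrides = {
            f.name: environ[f.name.upper()]
            for f in fields(cls)
            if f.name.upper() in environ
        }
        return cls(**overrides)


if __name__ == "__main__":
    env = {
        "VOICES_DIR": "src/voices/v1_1-zh",
        "DEFAULT_VOICE": "zf_094",
        "REPO_ID": "hexgrad/Kokoro-82M-v1.1-zh",
    }
    print(SketchSettings.from_env(env))
```

Fields without a matching variable keep their defaults, which is why repo_id still needs a default in config.py.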

4. Start using it

Visit http://localhost:8880/web/
Under Search voices, pick any voice whose name starts with z
Change Language from Auto to Chinese
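
To check the result without the web page, a request can also be sent from a script. This is a hedged sketch: it assumes the server exposes an OpenAI-style /v1/audio/speech endpoint and accepts the model, voice, input, and response_format fields shown, so verify both against your checkout. build_speech_request and synthesize are names made up for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8880"  # host and port from the step above


def build_speech_request(text, voice="zf_094", base_url=BASE_URL):
    """Build (but do not send) a POST request for the assumed speech endpoint."""
    payload = {
        "model": "kokoro",         # assumed model name
        "voice": voice,
        "input": text,
        "response_format": "mp3",  # assumed to be supported
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="out.mp3", **kwargs):
    """Send the request and write the returned bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

With the server running, synthesize("你好，世界。") writes whatever the server returns to out.mp3, which makes it easy to compare voices by ear.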

Mar 11 '25 05:03 chai51

Thanks. After following these steps, the pronunciation is now correct.

Mar 11 '25 10:03 luoxxib

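@chai51 I was just playing with this when I saw your commit. After cherry-picking it, what else do I need to do? Could you outline what has to be done after pulling the code directly?

I'm on a Mac and, for certain reasons, can't use Docker, so I went with the direct-run approach.
download_model.py would not run, so I manually downloaded the 1.0 model kokoro-v1_0.pth into Kokoro-FastAPI/api/src/models/v1_0. I did not run sudo apt install espeak-ng because I'm on a Mac; it doesn't seem strictly necessary, but I'm not sure.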

Mar 13 '25 08:03 fastfading

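You can ask DeepSeek what espeak-ng is for. After applying my commit, step 1 requires placing the resources downloaded from Hugging Face in the corresponding locations, and the variable names in step 3 have changed; see the newly added comments in start-gpu.sh for details. That should be about it. If you run into anything else, refer to the steps above and adjust as needed. I'm sure you'll manage.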
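
The placement from step 1 can be sanity-checked before starting the server. A minimal sketch, assuming the layout used earlier in this thread (api/src/models/v1_1-zh/kokoro-v1_1-zh.pth plus voice files under api/src/voices/v1_1-zh). The config.json name and the .pt voice extension are assumptions about the Hugging Face repository, and check_layout is a hypothetical helper.

```python
from pathlib import Path


def check_layout(api_src):
    """Return a list of problems found under the given api/src directory."""
    api_src = Path(api_src)
    model_dir = api_src / "models" / "v1_1-zh"
    voices_dir = api_src / "voices" / "v1_1-zh"
    problems = []

    # Weights path matches pytorch_kokoro_v1_file from this thread.
    weights = model_dir / "kokoro-v1_1-zh.pth"
    if not weights.is_file():
        problems.append(f"missing model weights: {weights}")

    # Assumed: the cloned repository ships a config.json next to the weights.
    config = model_dir / "config.json"
    if not config.is_file():
        problems.append(f"missing model config: {config}")

    # Assumed: voices are stored as .pt files such as zf_094.pt.
    if not voices_dir.is_dir() or not any(voices_dir.glob("*.pt")):
        problems.append(f"no voice files found in: {voices_dir}")

    return problems


if __name__ == "__main__":
    for line in check_layout("api/src") or ["layout looks fine"]:
        print(line)
```

Run it from the repository root; an empty result only means the files exist, not that git lfs fetched their full contents.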

Mar 13 '25 12:03 chai51

Because this model is also valuable to English speakers, sharing this information in English would benefit more people.

Mar 13 '25 14:03 RBEmerson970

With Kokoro-82M-v1.1-zh, the Chinese output sounds natural. With the other versions, Chinese is spoken with English intonation. For comparison, it is like a Japanese speaker talking in English with a strong accent.

Mar 21 '25 13:03 zhy844694805

That's odd. After following these steps, all the Chinese output sounds like Northeastern (Dongbei) dialect. Why?

Apr 14 '25 10:04 SunixLiu

Pure Chinese works, but mixed Chinese and English text fails with: Generation failed: 'ZHG2P' object has no attribute 'unk'

Apr 29 '25 03:04 Jacknolfskin

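@Jacknolfskin ZHG2P is a module in misaki, so misaki needs updating. To update, bump the following two libraries in pyproject.toml to version 0.9.4: "kokoro==0.9.4", "misaki[en,ja,ko,zh]==0.9.4", then reinstall the Python dependencies.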
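
Whether the upgrade took effect can be checked from Python. A small sketch using only the standard library; the 0.9.4 target is the version named above, and installed_version and at_least are helper names invented here. The comparison only understands dotted numeric versions and stops at the first non-numeric component.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def at_least(version, minimum):
    """Compare the leading numeric components of two dotted version strings."""
    def parse(text):
        parts = []
        for part in text.split("."):
            if not part.isdigit():
                break  # stop at suffixes such as "post1" or "rc1"
            parts.append(int(part))
        return tuple(parts)

    return parse(version) >= parse(minimum)


if __name__ == "__main__":
    for name in ("kokoro", "misaki"):
        found = installed_version(name)
        if found is None:
            print(f"{name}: not installed")
        else:
            status = "ok" if at_least(found, "0.9.4") else "older than 0.9.4"
            print(f"{name}: {found} ({status})")
```

Run it inside the same virtual environment the server uses, since a system-wide Python may report different versions.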

May 25 '25 16:05 remxcode

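After upgrading, the English parts still aren't spoken. Is there any way to get them read aloud? In mixed Chinese-English text, the speech becomes disjointed when a segment is missing.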

Jul 01 '25 03:07 leiax00

https://github.com/remsky/Kokoro-FastAPI/pull/237

This solved my problem.

Jul 01 '25 03:07 leiax00

Did it fix the problem of English being dropped?

Oct 11 '25 03:10 cosyman