How can I switch the model to hexgrad/Kokoro-82M-v1.1-zh? What should I do?
Wait until it's out of beta. If I understand correctly, they removed several voices for now, until it's production ready. Is there a particular reason you want to use this version? Does it have any exciting features I don't know about yet?
1. Get the resources
sudo apt install espeak-ng
git lfs install
cd api/src/models
git clone https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh
mv Kokoro-82M-v1.1-zh v1_1-zh
cp -r v1_1-zh/voices ../voices/v1_1-zh
2. Modify the code
default_voice: str = "af_heart"
Change this to the zf_094 voice sample so that Kokoro loads with a Chinese voice.
"v1_0/kokoro-v1_0.pth", description="PyTorch Kokoro V1 model filename"
Change to:
"v1_1-zh/kokoro-v1_1-zh.pth", description="PyTorch Kokoro V1 model filename"
if language not in lang_map:
    raise ValueError(f"Unsupported language code: {language}")
return EspeakBackend(lang_map[language])
Change to:
return EspeakBackend("cmn")
3. Modify the startup script
export VOICES_DIR=src/voices/v1_1-zh
4. Start using it
Open http://localhost:8880/web/. Under "Search voices...", pick any voice whose name starts with z, and change Language from Auto to Chinese.
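Before editing any code, it may help to confirm that step 1 put the files where the later steps expect them. The following is a minimal sketch, not part of the project: the function name missing_resources is hypothetical, and the two paths are simply derived from the cd, mv and cp commands in step 1, relative to the repository root.

```python
from pathlib import Path


def missing_resources(repo_root: Path) -> list[Path]:
    """Return the expected v1.1-zh paths that do not exist under repo_root."""
    expected = [
        # Model weights, after `mv Kokoro-82M-v1.1-zh v1_1-zh`
        repo_root / "api/src/models/v1_1-zh/kokoro-v1_1-zh.pth",
        # Voice packs, after `cp -r v1_1-zh/voices ../voices/v1_1-zh`
        repo_root / "api/src/voices/v1_1-zh",
    ]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing when everything is in place.
    for path in missing_resources(Path(".")):
        print(f"missing: {path}")
```

If the .pth file is reported missing even though the clone succeeded, one thing worth checking is whether git lfs was installed before cloning.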
Which file contains the line return EspeakBackend(lang_map[language])?
I made the changes following your steps, but the tones in the generated speech are wrong.
Apply these steps to the original GitHub project. Be aware that this may make your existing environment unusable.
1. Get the resources
sudo apt install espeak-ng
git lfs install
cd api/src/models
git clone https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh
mv Kokoro-82M-v1.1-zh v1_1-zh
cp -r v1_1-zh/voices ../voices/v1_1-zh
pip uninstall kokoro
pip install kokoro
pip install "misaki[zh]"
2. Modify the code
api/src/core/config.py
allow_local_voice_saving: bool = (
    False  # Whether to allow saving combined voices locally
)
Change to:
allow_local_voice_saving: bool = (
    False  # Whether to allow saving combined voices locally
)
repo_id: str = "hexgrad/Kokoro-82M"
api/src/core/model_config.py
# Model filename
pytorch_kokoro_v1_file: str = Field(
    "v1_0/kokoro-v1_0.pth", description="PyTorch Kokoro V1 model filename"
)
Change to:
# Model filename
pytorch_kokoro_v1_file: str = Field(
    "v1_1-zh/kokoro-v1_1-zh.pth", description="PyTorch Kokoro V1 model filename"
)
api/src/inference/kokoro_v1.py
# First block
self._model = KModel(config=config_path, model=model_path).eval()
# Second block
self._pipelines[lang_code] = KPipeline(
    lang_code=lang_code, model=self._model, device=self._device
)
Change to:
# First block
self._model = KModel(config=config_path, model=model_path, repo_id=settings.repo_id).eval()
# Second block
self._pipelines[lang_code] = KPipeline(
    lang_code=lang_code, model=self._model, device=self._device, repo_id=settings.repo_id
)
api/src/inference/model_manager.py
warmup_text = "Warmup text for initialization."
Change to:
warmup_text = "初始化的预热文本。"
api/src/services/text_processing/phonemizer.py
if language not in lang_map:
    raise ValueError(f"Unsupported language code: {language}")
return EspeakBackend(lang_map[language])
Change to:
return EspeakBackend("cmn")
3. Modify the startup script
export VOICES_DIR=src/voices/v1_1-zh
export DEFAULT_VOICE=zf_094
export REPO_ID=hexgrad/Kokoro-82M-v1.1-zh
4. Start using it
Open http://localhost:8880/web/
Under "Search voices", pick any voice whose name starts with z
Change Language from Auto to Chinese
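The exports in step 3 only matter if the settings object picks them up, which is why config.py gains a repo_id field. The sketch below illustrates that relationship with a plain dataclass rather than the project's real settings class. ZhSettings and _env are hypothetical names, the mapping of REPO_ID to repo_id is an assumption about how the project reads its environment, and the voices_dir default is a placeholder rather than the project's actual value.

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str) -> str:
    """Read an environment variable, falling back to a default."""
    return os.environ.get(name, default)


@dataclass
class ZhSettings:
    """Hypothetical stand-in for the project's settings class."""

    # Defaults for repo_id and default_voice mirror the values quoted in step 2.
    repo_id: str = field(
        default_factory=lambda: _env("REPO_ID", "hexgrad/Kokoro-82M")
    )
    default_voice: str = field(
        default_factory=lambda: _env("DEFAULT_VOICE", "af_heart")
    )
    # Placeholder default, not taken from the project.
    voices_dir: str = field(
        default_factory=lambda: _env("VOICES_DIR", "voices")
    )


if __name__ == "__main__":
    # With the exports from step 3 in place, this should show the zh values.
    print(ZhSettings())
```

Running this in the same shell as the startup script is a quick way to see which values the server would start with, assuming it reads the same variable names.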
Thanks. After following these steps, the pronunciation is correct.
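@chai51 I was just experimenting with this when I saw your commit. After cherry-picking your commit, what else do I need to do? Could you describe what has to be done right after pulling the code?
I'm on a Mac and, for various reasons, can't use Docker, so I'm running it directly.
download_model.py would not run, so I manually downloaded the 1.0 model, kokoro-v1_0.pth, into Kokoro-FastAPI/api/src/models/v1_0.
I did not run sudo apt install espeak-ng because I'm on a Mac. It doesn't seem strictly necessary, but I'm not sure.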
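You can ask DeepSeek what espeak-ng does. With my commit applied, step 1 still requires placing the resources downloaded from Hugging Face in the corresponding locations, and the variable names in step 3 have changed; see the comments newly added to start-gpu.sh for details. That should be about it. If you run into other problems, refer to the steps above and adjust as needed. I'm sure you'll manage.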
Because the model is valuable to English speakers too, sharing this information in English would benefit more people.
With Kokoro-82M-v1.1-zh, the Chinese voice output sounds normal. In other versions, Chinese speech is pronounced with English intonation. As a comparison, think of a Japanese person speaking English.
That's odd. After following these steps, all the Chinese speech sounds like Northeastern dialect. Why is that?
Pure Chinese text works, but mixed Chinese and English text fails with: Generation failed: 'ZHG2P' object has no attribute 'unk'
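@Jacknolfskin ZHG2P is a module in misaki, so misaki needs to be updated. To update, bump the following two libraries in pyproject.toml to version 0.9.4:
"kokoro==0.9.4", "misaki[en,ja,ko,zh]==0.9.4",
Then reinstall the Python dependencies.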
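After reinstalling, it can be worth confirming that the running environment really has the pinned versions, since a stale install would still raise the same ZHG2P error. This is a small sketch using the standard library's importlib.metadata; the helper name version_mismatches and the REQUIRED table are hypothetical, with the version numbers taken from the pins above.

```python
from importlib import metadata

# Versions pinned in pyproject.toml, as suggested in this thread.
REQUIRED = {"kokoro": "0.9.4", "misaki": "0.9.4"}


def version_mismatches(required=REQUIRED, lookup=metadata.version):
    """Return {package: installed_version_or_None} for packages off the pin."""
    mismatches = {}
    for package, wanted in required.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = installed
    return mismatches


if __name__ == "__main__":
    # An empty dict means both packages match the pinned version.
    print(version_mismatches())
```

The lookup parameter exists only so the check can be exercised without touching the real environment.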
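After upgrading, the English parts still are not spoken. Is there a way to get them read aloud? In mixed Chinese and English text, a missing segment makes the speech disjointed.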
https://github.com/remsky/Kokoro-FastAPI/pull/237
This solved my problem.