vits_chinese issues

How can I train on English?

6

Hey, is it possible to adapt this model to train on an English dataset? Or should I just use normal VITS?

nivibilla

How can erhua (儿化音) be improved?

2

Hello, author. I would like to improve erhua (儿化音) handling based on your code. Could you suggest some concrete approaches? Thanks.

caifangvip

Some questions for the KL_loss and KL_loss_r and model behaviors

13

Hello MaxMax2016, thank you for sharing your code on the improved VITS. I would like to check with you about the model's behavior when adding the bi-directional KL divergence. In this...

feng-yufei

good first issue

I want to train on a dialect. How should I go about it?

1

What are the requirements for the data source? I see that the project uses the Pinyin library. Does that mean it only works for Mandarin?

zoremax

A question about [migrating] the project

I ported this project into another project to use it as a tool. One file raises an error: `vits_chinese\monotonic_align\__init__.py", line 3, in ` `from tools.vits_chinese.monotonic_align.core import maximum_path_c ModuleNotFoundError: No module named 'tools.vits_chinese.monotonic_align.core'` Because of the migration, the imports in many files use absolute paths. `import numpy as np import torch from tools.vits_chinese.monotonic_align.core import maximum_path_c def maximum_path(neg_cent, mask): """Cython...

aylitat
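
One possible direction for the import problem above, offered as a sketch rather than a confirmed fix: a relative import (`from .core import ...`) inside the package's `__init__.py` resolves no matter how deeply the package is nested in a host project, so the prefix does not have to be rewritten after each move. The package names `host_app_demo` and `vendored_tts` are hypothetical, and a plain `core.py` stands in for the compiled Cython module. The same `ModuleNotFoundError` can also appear when the compiled `core` extension is simply absent from the new location, which this sketch does not address.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create host_app_demo/vendored_tts/monotonic_align on disk (hypothetical names)."""
    pkg = root / "host_app_demo" / "vendored_tts" / "monotonic_align"
    pkg.mkdir(parents=True)
    # Each directory on the dotted path gets an __init__.py so it is a regular package.
    (root / "host_app_demo" / "__init__.py").write_text("")
    (root / "host_app_demo" / "vendored_tts" / "__init__.py").write_text("")
    # Stand-in for the compiled Cython module `core` (assumption: pure Python stub).
    (pkg / "core.py").write_text("def maximum_path_c():\n    return 'stub'\n")
    # Relative import: no dependence on the host project's package prefix.
    (pkg / "__init__.py").write_text("from .core import maximum_path_c\n")


def import_from_nested_location() -> str:
    """Import the nested package and call the stub through the relative import."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_package(root)
        sys.path.insert(0, str(root))
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(
                "host_app_demo.vendored_tts.monotonic_align"
            )
            return module.maximum_path_c()
        finally:
            # Undo the path change and drop the demo modules from the cache.
            sys.path.remove(str(root))
            for name in [n for n in sys.modules if n.startswith("host_app_demo")]:
                del sys.modules[name]


result = import_from_nested_location()
```

If the package is later moved to a different parent, the `__init__.py` written above would not need to change.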

Multi-speaker results

2

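Have you tried multi-speaker training after adding the prosody features? My multi-speaker results are not as good as single-speaker training, where the output is very realistic.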

suzhenghang

Error running vits_prepare.py from the Train section: IndexError: Replacement index 2 out of range for positional args tuple

6

**I followed README.md, and the training step below fails. How should I handle this?** ### Train download baker data: https://www.data-baker.com/data/index/TNtts/ change sample rate of waves to 16kHz, and put waves to ./data/waves put 000001-010000.txt to ./data/000001-010000.txt > python vits_prepare.py -c ./configs/bert_vits.json ###...

1040003585
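
For context on the error class reported above: Python raises this `IndexError` from `str.format` when a template has more positional placeholders than arguments. The sketch below reproduces that situation in isolation and wraps it in a clearer message. It does not claim to locate the failing line in `vits_prepare.py`; the helper `format_line` and the sample fields are hypothetical.

```python
def format_line(template: str, *fields: str) -> str:
    """Format a template, reporting a clearer error when fields are missing."""
    try:
        return template.format(*fields)
    except IndexError as err:
        raise ValueError(
            f"template {template!r} has more placeholders than the "
            f"{len(fields)} fields supplied"
        ) from err


# Three placeholders and three values: formatting succeeds.
ok_line = format_line("{}|{}|{}", "000001", "text", "phonemes")

# Three placeholders but only two values: str.format raises IndexError.
try:
    "{}|{}|{}".format("000001", "text")
    raw_error = None
except IndexError as err:
    raw_error = err
```

A check of this kind around the formatting call can help show which input line supplied fewer fields than the template expects.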

Does segment_size matter for different datasets?

6

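Hello, I have a question. My audio clips average about 4 s; 25% of them are longer than 5 s, and the longest is 10 s. Does segment_size need to be increased in this case? If segment_size is too large, GPU memory may run out. In actual training, is segment_size applied in the decoder, with only one segment selected for training? To make full use of the data when training on long audio like this, should the clips first be cut to a certain duration range?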

liroda
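
As background for the question above: in the original VITS training code, the decoder receives a randomly positioned window of `segment_size` samples per utterance, sliced in the latent frame domain, while the other modules see the whole utterance. The sketch below illustrates that slicing with plain Python, assuming this repository follows the same scheme. The 16 kHz sample rate comes from the README excerpt in this listing; `hop_length = 256` and `segment_size = 8192` are illustrative assumptions, not values read from the config.

```python
import random


def rand_slice_start(num_frames: int, segment_frames: int, rng: random.Random) -> int:
    """Pick a random start so the window fits inside the utterance."""
    max_start = max(num_frames - segment_frames, 0)
    return rng.randint(0, max_start)


def slice_for_decoder(latent_frames, segment_size, hop_length, rng):
    """Return the contiguous window of latent frames the decoder would see."""
    segment_frames = segment_size // hop_length
    start = rand_slice_start(len(latent_frames), segment_frames, rng)
    return latent_frames[start:start + segment_frames]


sample_rate = 16000      # from the README excerpt quoted in this listing
hop_length = 256         # assumption for illustration
segment_size = 8192      # assumption for illustration

# A 4-second utterance expressed as latent frame indices.
num_frames = (4 * sample_rate) // hop_length
frames = list(range(num_frames))

window = slice_for_decoder(frames, segment_size, hop_length, random.Random(0))
```

Because a new window is drawn each time an utterance is visited, different parts of a long clip can be covered over many epochs without raising `segment_size`.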

Setting up the environment on Windows is difficult

5

Is there a good way to do it?

coderyiyang

Speaking rate issue

3

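First of all, thanks for open-sourcing this; the synthesis quality is great. I tried fine-tuning the model and found that the synthesized audio is always slower than the original speech, even for data from the training set. I later found that the ceil operation at https://github.com/PlayVoice/vits_chinese/blob/5b662006ff016f749e6c76a15b4e8e8210a4e1cf/models.py#L562 may be the cause. However, taking floor instead makes the speech too fast. Are there any suggestions for improving this?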

TinaChen95

enhancement
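
The rounding effect described in the speaking-rate issue can be illustrated with toy numbers. The sketch below compares the total frame count under ceil, floor, and nearest-integer rounding of per-phoneme durations, plus ceil combined with a scaling factor. The durations are made up, and `length_scale` is assumed to behave like the parameter of the same name in the upstream VITS `infer` method; none of this is taken from this repository's `models.py`.

```python
import math


def total_frames(durations, rounding, length_scale=1.0):
    """Sum per-phoneme frame counts after scaling and rounding."""
    return sum(rounding(d * length_scale) for d in durations)


# Hypothetical predicted durations in frames (real-valued before rounding).
predicted = [2.3, 4.7, 1.2, 3.4, 0.6]

ceil_total = total_frames(predicted, math.ceil)    # every phoneme rounded up
floor_total = total_frames(predicted, math.floor)  # every phoneme rounded down
round_total = total_frames(predicted, round)       # nearest integer

# Keeping ceil but shrinking durations first (assumed length_scale semantics).
scaled_ceil_total = total_frames(predicted, math.ceil, length_scale=0.85)
```

With these toy values, ceil adds close to one extra frame per phoneme and floor gives the 0.6-frame phoneme no frames at all, which matches the "too slow" and "too fast" observations. Nearest-integer rounding or a `length_scale` slightly below 1 lands in between; with rounding, a minimum of one frame per phoneme may still be needed.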
