MaxMax2016
1. Record erhua (r-colored) speech data. 2. Build an erhua pronunciation lexicon: https://github.com/Executedone/Chinese-FastSpeech2/blob/main/lexicon/pinyin-lexicon-r.txt [pinyin-lexicon-r.txt](https://github.com/PlayVoice/vits_chinese/files/11740836/pinyin-lexicon-r.txt) 3. Train the model.
Import: `import tools.vits_chinese.monotonic_align`. Usage: `monotonic_align.maximum_path`. Is that how you are using it?
You can freeze the posterior encoder and the HiFi-GAN decoder, and maybe also the text encoder, then add kl_loss_r and fine-tune a well-trained VITS model (training only the flow).
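A minimal PyTorch sketch of the freezing step, not the repo's actual training code. `TinyVITS` is a hypothetical stand-in model, and the submodule names `enc_p`, `enc_q`, `flow`, and `dec` are assumptions about how the synthesizer is laid out; check them against your own checkpoint before reusing this.

```python
import torch
from torch import nn


class TinyVITS(nn.Module):
    """Hypothetical stand-in for a VITS synthesizer; only the submodule names matter."""

    def __init__(self):
        super().__init__()
        self.enc_p = nn.Linear(4, 4)  # text (prior) encoder, assumed name
        self.enc_q = nn.Linear(4, 4)  # posterior encoder, assumed name
        self.flow = nn.Linear(4, 4)   # normalizing flow, assumed name
        self.dec = nn.Linear(4, 4)    # HiFi-GAN style decoder, assumed name


def freeze_all_but(model, trainable=("flow",)):
    """Turn off gradients for every parameter outside the named top-level submodules."""
    for name, param in model.named_parameters():
        top_level = name.split(".")[0]
        param.requires_grad = top_level in trainable
    # Return only the parameters that will still be updated.
    return [p for p in model.parameters() if p.requires_grad]


model = TinyVITS()
trainable_params = freeze_all_but(model, trainable=("flow",))

# Build the optimizer from the trainable parameters only.
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)
```

If the frozen modules contain dropout or normalization layers, it is probably also worth calling `.eval()` on them during fine-tuning so their behavior stays fixed.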
> Following this. Does adding this loss actually help? Has there been any comparison?

With this loss, there are fewer sound errors than with the original model.
I have not run into this problem, because the Baker data is labeled with high precision. I wonder whether, if the data is labeled incorrectly, kl_loss_r will propagate the error to subsequent modules (posterior encoder and HiFi-GAN...
@feng-yufei https://github.com/PlayVoice/vits_chinese/issues/36#issuecomment-1463298450 Yes, it increases the mel loss.
Yes, using Mandarin pinyin may cause some pronunciation problems.
This model is used to solve Chinese prosodic problems. So, what is your goal?
https://github.com/PlayVoice/vits_chinese/blob/master/train.py#L267 Maybe this is useful for you: https://github.com/heatz123/naturalspeech
@nivibilla I think an English BERT may be useful. The characters used by BERT have multiple pronunciation units: two in Chinese and n in English. The BERT vector of each character...