yanzhuangzhuang-beep
I have the LJSpeech-1.1 data, but it only contains wav files, so that is the path I use for wav_dir. You said I need to download data, but I don't have the link. Could you provide the URL? Thanks.
OK, I have finished nvidia_preprocessing. When I run train_fastspeech I get an error: no key "hp.model.phoneme_acoustic_embed". I checked config.yaml and it has no model.phoneme_acoustic_embed key or value.
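To see exactly which part of a dotted key is missing before training starts, something like the following might help. It is only a sketch: it assumes the parsed config.yaml behaves like a nested dict, and `get_dotted` and the sample `config` are hypothetical names, not part of the repository.

```python
_MISSING = object()  # sentinel so that None or False can be used as real defaults


def get_dotted(config, dotted_key, default=_MISSING):
    """Look up a dotted key such as 'model.phoneme_acoustic_embed' in a nested dict."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            if default is _MISSING:
                raise KeyError(f"config is missing '{dotted_key}' (stopped at '{part}')")
            return default
        node = node[part]
    return node


# Hypothetical stand-in for the parsed config.yaml; the real file has other keys.
config = {"model": {"some_other_option": 1}}

# Falls back to False instead of failing when the key is absent.
embed_enabled = get_dotted(config, "model.phoneme_acoustic_embed", default=False)
print(embed_enabled)
```

Whether False is the right fallback for this option is a guess; the point is only to make the missing key visible.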
In the YAML I set model.phoneme_acoustic_embed = True, and then `self.utterance_encoder(ys.transpose(1, 2)).transpose(1, 2)` fails with: IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
Have you solved this problem?
Hello, have you found out what the value of the phonetic acoustic embedding should be?
Have you got it?
The main point is that I don't know how to use MFA ... ![1638442790(1)](https://user-images.githubusercontent.com/62825785/144409721-7275cee1-20f5-4e44-a04f-475e3b87a9ca.png) This is with the Biaobei dataset: it runs, but I get NaN. I suspect the data is messy.
Even after switching to Biaobei, long sentences still have problems.
Thank you, I have found the problem and solved it.
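I had the same issue before. I think the lexicon was incomplete, which left some indices empty and the data misaligned. But after I filled in the missing vocabulary I ran into a new problem, and I'm not sure whether it is a sampling-rate issue. You can start by printing the missing characters, in text/system.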
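Printing the missing characters could look roughly like this. It is a minimal sketch: it assumes each transcript is already split into tokens and that the symbol table is a plain list. `find_missing_symbols`, `symbols`, and `transcripts` are hypothetical names with toy data, not the repository's own.

```python
def find_missing_symbols(transcripts, symbols):
    """Return {utterance_id: [tokens not present in the symbol table]}."""
    known = set(symbols)
    missing = {}
    for utt_id, tokens in transcripts.items():
        unknown = [t for t in tokens if t not in known]
        if unknown:
            missing[utt_id] = unknown
    return missing


# Toy data standing in for the real symbol table and aligned transcripts.
symbols = ["a", "b", "sil"]
transcripts = {
    "utt1": ["a", "b"],
    "utt2": ["a", "zh", "sil"],  # "zh" is not in the symbol table
}

missing = find_missing_symbols(transcripts, symbols)
for utt_id, unknown in missing.items():
    print(utt_id, unknown)
```

Any utterance that shows up here has tokens with no index, which is one way the text and audio features can end up misaligned.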