TingC-95 issues

Results: 15 issues of TingC-95

Why do we need to shuffle data in one particular batch?

https://github.com/Rayhane-mamah/Tacotron-2/blob/ab5cb08a931fc842d3892ebeb27c8b8734ddd4b8/tacotron/feeder.py#L201

Is this paper publicly available now?

Error when running inference on CPU in Colab

The same model runs inference without problems on GPU, but on CPU I get the following error:
Traceback (most recent call last):
  File "/content/Retrieval-based-Voice-Conversion-WebUI/infer-web.py", line 141, in vc_single
    if_f0 = cpt.get("f0", 1)
NameError: name 'cpt' is not defined
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line...

bug

Ensemble Functionality for Model Development

**Is your feature request related to a problem? Please describe.** Yes, the problem is that current models may not always provide accurate predictions, especially when dealing with complex and diverse...

enhancement

Discrepancy between my evaluation results and README for MNLI in evaluation.py

Hi, I'm running evaluation.py on MNLI as described in the README, but I'm getting different results compared to what's displayed there. I'm using Google Colab for this, and you can...

Any guidelines for tuning noise_scale_w?

I found that adjusting noise_scale_w affects the smoothness of the synthesized speech. When noise_scale_w is close to 1, the speech speed is slower and the speech is...

Discussion: How about distilling the MAS result from the teacher VITS to replace the Text Aligner?

I found that VITS's MAS result is very accurate, so why not distil the duration information to train the student model?

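About the student model

Why does the student model reuse only the teacher model's enc_q and flow, and not the text encoder? When tuning a student model, is it better to transfer from a teacher model trained on the same dataset, or from another student model? And how long does a student model usually take to converge?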

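Abnormally high GPU memory usage early in training

I noticed that at the very start of training, GPU memory usage fluctuates sharply and it is easy to run out of memory; after a while, memory usage drops and GPU memory utilization stays fairly low. Has anyone else observed this? Why does it happen?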

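Speaking rate issue

First of all, thanks for open-sourcing this; the synthesis quality is great. I tried fine-tuning the model and found that the synthesized audio is always slower than the original speaking rate, even for data from the training set. I later found that the ceil operation at https://github.com/PlayVoice/vits_chinese/blob/5b662006ff016f749e6c76a15b4e8e8210a4e1cf/models.py#L562 may be the cause. But taking floor instead makes it too fast. Are there any suggestions for improving this?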

enhancement
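
The ceil-versus-floor observation in the speaking rate issue above can be illustrated with plain arithmetic. The sketch below is not the code from the linked repository; the duration values are made up, and `round_with_carry` is a hypothetical helper showing one possible middle ground (rounding each duration while carrying the leftover fraction to the next one). It only shows why rounding every duration up lengthens the utterance and rounding every duration down shortens it.

```python
import math


def total_frames(durations, rounding):
    """Sum per-phoneme frame counts after applying a rounding function."""
    return sum(int(rounding(d)) for d in durations)


def round_with_carry(durations):
    """Round each duration to the nearest integer, carrying the leftover
    fraction into the next duration so the rounding error does not pile up.
    Hypothetical helper for illustration only."""
    frames = []
    carry = 0.0
    for d in durations:
        target = d + carry
        n = math.floor(target + 0.5)  # round half up
        frames.append(n)
        carry = target - n            # stays within [-0.5, 0.5)
    return frames


# Hypothetical predicted durations in frames (not taken from any model).
predicted = [2.3, 4.1, 1.2, 3.6, 2.4, 5.3]

exact_total = sum(predicted)
ceil_total = total_frames(predicted, math.ceil)
floor_total = total_frames(predicted, math.floor)
carry_frames = round_with_carry(predicted)

print(f"real-valued total: {exact_total:.1f}")
print(f"ceil total:        {ceil_total}")
print(f"floor total:       {floor_total}")
print(f"carry total:       {sum(carry_frames)}  frames={carry_frames}")
```

With these made-up values, ceil yields 23 frames and floor yields 17 against a real-valued total of 18.9, while the carry scheme yields 19. Whether such a scheme actually improves perceived speaking rate in a given model is an open question that would need listening tests.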