MaxMax2016 comments

Results: 243 comments of MaxMax2016

The crepe used in this project tends to go off-pitch; switching to rmvpe for fundamental-frequency (F0) extraction might work better

Yes, you can try the following two projects: https://github.com/thestmitsuki/so-vits-svc-rmvpe and https://github.com/DLSeed/so-vits-svc-5.0

Whisper length segmentation issue

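> So I would like to ask: if the audio is deliberately split into 30-second segments
> versus the usual 2~15-second segments for training

There should be little difference. With the same data, a different batch_size and learning_rate will affect the results.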

processing multiple files at once

Error report

Could you check whether the directory whisper-vits-svc-bigvgan-mix-v2/speaker exists?

Error report

![speaker](https://github.com/PlayVoice/whisper-vits-svc/assets/16432329/c81f5768-946c-42d5-a793-68ae282f8718) This folder.

whisper and hubert

Use Whisper so that each word is pronounced clearly, and use HuBERT soft to fill in the pronunciation details.

whisper and hubert

https://github.com/fishaudio/chinese-hubert-soft

Issue while running prepare/preprocess_crepe.py file.

See https://github.com/pytorch/pytorch/issues/18413. Maybe one of your audio files is too short or empty.
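One way to test that guess before rerunning preprocessing is to scan the dataset for clips that are empty or very short. The sketch below is only an illustration, not part of the project: `find_short_wavs`, `wav_duration`, and the 0.5-second threshold are hypothetical, and the standard-library `wave` module only reads PCM WAV files, so anything it cannot parse is reported as zero-length.

```python
import wave
from pathlib import Path

MIN_SECONDS = 0.5  # hypothetical threshold, not taken from the project


def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds (0.0 if it has no frames)."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return wf.getnframes() / rate if rate else 0.0


def find_short_wavs(root, min_seconds=MIN_SECONDS):
    """List (path, duration) for every .wav under root shorter than min_seconds."""
    flagged = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            duration = wav_duration(path)
        except (wave.Error, EOFError):
            # Zero-byte or unparseable file: report it as zero-length.
            duration = 0.0
        if duration < min_seconds:
            flagged.append((path, duration))
    return flagged
```

Calling `find_short_wavs("path/to/wavs")` on the training folder lists the candidates to remove or re-cut before running the preprocessing script again.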

Issue while running prepare/preprocess_crepe.py file.

good suggestion

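The model consists of a text encoder, wave encoder, Flow, decoder, and dur. The wave encoder is only used during training, so the student model can use it directly. The text encoder is trained to fit the distribution of the Flow, and the decoder is trained to decode the output of the wave encoder. For distillation, therefore, use the teacher's wave encoder and Flow with their network parameters frozen, and train a student model with fewer parameters: its text encoder fits the distribution of the Flow, and its decoder decodes the output of the wave encoder. @uloveqian2021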
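As a rough sketch of what the student's training objective could look like under this scheme, assuming a VITS-style formulation (the symbols and the exact loss terms here are illustrative assumptions, not taken from the project's code):

$$
\mathcal{L}_{\text{student}} =
D_{\mathrm{KL}}\big(q_{T}(z \mid x)\,\|\,p_{S}(z \mid c)\big)
+ \lambda\,\big\lVert \mathrm{mel}(x) - \mathrm{mel}\big(G_{S}(z)\big) \big\rVert_{1},
\qquad z \sim q_{T}(z \mid x)
$$

$$
p_{S}(z \mid c) = \mathcal{N}\big(f_{T}(z);\ \mu_{S}(c),\ \sigma_{S}(c)\big)\,
\left|\det \frac{\partial f_{T}(z)}{\partial z}\right|
$$

Here $x$ is the target audio and $c$ the content input. $q_{T}$ is the posterior from the teacher's frozen wave encoder, and $f_{T}$ is the teacher's frozen Flow. $\mu_{S}, \sigma_{S}$ come from the student's text encoder, $G_{S}$ is the student's decoder, and $\lambda$ is a weighting factor. Only the student's parameters receive gradients: the first term makes the student text encoder fit the Flow's distribution, and the second makes the student decoder decode the wave encoder's output.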