Amphion support fs2 24k training, aligned mel setting with gan vocoder

Open qinghua2016 opened this issue 1 year ago • 3 comments

Fixed FS2 24k training and inference issues, and fixed HiFi-GAN vocoder training.
Aligned the mel features of the acoustic model and the vocoder so that people can use the pretrained model. Fixed other data preprocessing issues.
Fixed librosa usage issues.
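
Aligning the mel features comes down to having the acoustic model and the vocoder agree on every mel extraction parameter. A minimal sketch of such a check is below; the key names (`sample_rate`, `n_fft`, `hop_size`, `win_size`, `n_mel`, `fmin`, `fmax`) and the example values are assumptions for illustration and may not match the actual Amphion config files.

```python
# Hypothetical parameter names; the real config keys may differ.
MEL_KEYS = ("sample_rate", "n_fft", "hop_size", "win_size", "n_mel", "fmin", "fmax")


def mel_mismatches(acoustic_cfg, vocoder_cfg, keys=MEL_KEYS):
    """Return {key: (acoustic_value, vocoder_value)} for every key that differs."""
    return {
        k: (acoustic_cfg.get(k), vocoder_cfg.get(k))
        for k in keys
        if acoustic_cfg.get(k) != vocoder_cfg.get(k)
    }


# Illustrative values only, not taken from the repository.
fs2_cfg = {
    "sample_rate": 24000,
    "n_fft": 1024,
    "hop_size": 256,
    "win_size": 1024,
    "n_mel": 80,
    "fmin": 0,
    "fmax": 12000,
}
vocoder_cfg = dict(fs2_cfg, n_mel=100)  # same settings except the mel dimension

print(mel_mismatches(fs2_cfg, vocoder_cfg))
```

An empty result means the two configs agree on all listed keys; any entry points at a setting that would make the vocoder receive mels it was not trained on.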

Dec 24 '23 15:12 qinghua2016

Please use black to format your code. For example, to format a file named wrong_format.py, you can run:

pip install black
black wrong_format.py

Dec 25 '23 07:12 RMSnow

Thanks for your suggestion. We tried using the pre-trained vocoder for FS2 with the 100-dimensional mel-spectrogram setting, but the results were poor compared with the current settings. Therefore, we have decided to retain the current settings for FS2.

Dec 25 '23 17:12 lmxue

Thanks for your suggestion. As @lmxue mentioned, we tried aligning FS2 with the vocoder's default setting at an early stage, but it resulted in poor performance. If you have obtained good results with the aligned setting, please attach the pretrained model and some demos to this PR, and we will then process it.

Dec 26 '23 15:12 VocodexElysium

Hi, thanks for your suggestion. We're closing this issue for the reasons mentioned by @lmxue and @VocodexElysium: when we tried the default vocoder setting, it resulted in poor performance. If you have obtained good results with the aligned setting, you're welcome to reopen the PR. Thanks!

Jul 12 '24 02:07 jiaqili3
