support fs2 24k training, aligned mel setting with gan vocoder

Open qinghua2016 opened this issue 1 year ago • 3 comments

  • Fixed FS2 24k training and inference issues; fixed HiFi-GAN vocoder training.
  • Aligned the mel features of the acoustic model and the vocoder so that people can use the pretrained model; fixed other data preprocessing issues.
  • Fixed librosa usage issues.
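
For readers unfamiliar with the term, "aligned mel features" means the acoustic model and the vocoder extract mel-spectrograms with the same parameters. The sketch below shows one way to check that. It is an illustration only: the key names in `MEL_KEYS`, the helper `mel_mismatches`, and the sample values are hypothetical and are not taken from the Amphion configs.

```python
# Hypothetical mel-related config keys; adjust to the names your configs use.
MEL_KEYS = ("sample_rate", "n_fft", "hop_size", "win_size", "n_mel", "fmin", "fmax")


def mel_mismatches(acoustic_cfg, vocoder_cfg, keys=MEL_KEYS):
    """Return {key: (acoustic_value, vocoder_value)} for every key that differs."""
    mismatches = {}
    for key in keys:
        acoustic_value = acoustic_cfg.get(key)
        vocoder_value = vocoder_cfg.get(key)
        if acoustic_value != vocoder_value:
            mismatches[key] = (acoustic_value, vocoder_value)
    return mismatches


# Illustrative values only (assumed, not the project's real settings).
acoustic = {"sample_rate": 24000, "n_fft": 1024, "hop_size": 256,
            "win_size": 1024, "n_mel": 80, "fmin": 0, "fmax": 12000}
vocoder = dict(acoustic, n_mel=100)

for key, (a, v) in mel_mismatches(acoustic, vocoder).items():
    print(f"{key}: acoustic={a} vocoder={v}")
```

An empty result means the two feature pipelines agree on every listed parameter; any reported key would need to be reconciled before a pretrained vocoder could be reused with the acoustic model.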

qinghua2016 avatar Dec 24 '23 15:12 qinghua2016

Please use black to format your code. For example, to format a file named wrong_format.py, you can run:

pip install black
black wrong_format.py

RMSnow avatar Dec 25 '23 07:12 RMSnow

Thanks for your suggestion. We tried using the pre-trained vocoder with FS2 under a 100-dimensional mel-spectrogram setting, but the results were worse than with the current settings. We have therefore decided to keep the current settings for FS2.

lmxue avatar Dec 25 '23 17:12 lmxue

Thanks for your suggestion. As @lmxue mentioned, we tried aligning FS2 with the vocoder's default settings at an early stage, but this resulted in poor performance. If you have obtained good results with the aligned settings, please attach the pretrained model and some demos to this PR, and we will then process it.

VocodexElysium avatar Dec 26 '23 15:12 VocodexElysium

Hi, thanks for your suggestion. We're closing this issue for the reasons mentioned by @lmxue and @VocodexElysium: we tried using the default vocoder settings, but this resulted in poor performance. If you have obtained good results with the aligned settings, you're welcome to reopen the PR. Thanks!

jiaqili3 avatar Jul 12 '24 02:07 jiaqili3