
Question on using WaveGlow instead of MelGAN

Open faranaziz opened this issue 4 years ago • 1 comments

Hello, I want to use WaveGlow, since MelGAN produces a lot of metallic sound. I see this in the config:

audio: # WARNING! This cannot be changed unless you're planning to train the MelGAN vocoder by yourself.
  n_mel_channels: 80
  filter_length: 1024
  hop_length: 256
  win_length: 1024
  sampling_rate: 22050
  mel_fmin: 70.0
  mel_fmax: 8000.0

What needs to change for this to work with the pre-trained WaveGlow? I tried using it, but I think there is a problem with the mel normalization, since the sound is very noisy.

I know WaveGlow uses mel_fmin: 0.0. I changed that and retrained, but it still does not work.
Thank you.

faranaziz, Apr 11 '21 05:04

The mel-spectrogram calculation code of WaveGlow and Cotatron differs considerably. Therefore the output will sound metallic even when the mel configs are identical.

You might want to try matching the mel-spectrogram code. Since training WaveGlow from scratch won't be feasible for most users, I would recommend making Cotatron's mel-spectrogram code match WaveGlow's and then training Cotatron again.
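To illustrate why identical config values are not enough, here is a minimal sketch of one kind of mismatch that can occur between two mel pipelines: the log base used for dynamic-range compression. This is an assumption chosen for illustration, not a statement about what either codebase actually does. The function names, the clamp floor `CLAMP_MIN`, and the random input are all hypothetical.

```python
import numpy as np

CLAMP_MIN = 1e-5  # assumed floor applied before the log (hypothetical value)


def compress_ln(mel):
    """Natural-log compression, as one hypothetical pipeline might do."""
    return np.log(np.clip(mel, CLAMP_MIN, None))


def compress_log10(mel):
    """Base-10 log compression, as another hypothetical pipeline might do."""
    return np.log10(np.clip(mel, CLAMP_MIN, None))


def log10_to_ln(mel_log10):
    """Rescale a log10-compressed mel onto the natural-log scale."""
    return mel_log10 * np.log(10.0)


# Fake linear-scale mel magnitudes: 80 channels x 10 frames.
rng = np.random.default_rng(0)
mel = rng.uniform(1e-4, 1.0, size=(80, 10))

mel_a = compress_ln(mel)
mel_b = compress_log10(mel)

# Same input, same n_mel_channels, yet the two outputs disagree.
max_gap = np.abs(mel_a - mel_b).max()

# After rescaling, the two representations agree again.
max_gap_after = np.abs(mel_a - log10_to_ln(mel_b)).max()

print("max gap before matching:", max_gap)
print("max gap after matching:", max_gap_after)
```

A vocoder trained on one scale and fed the other sees inputs outside its training distribution. In practice the differences can also come from the mel filterbank normalization, windowing, or padding, so comparing both pipelines' outputs on the same audio file is a useful first check.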

Plus, you might be interested in our recent paper: https://arxiv.org/abs/2104.00931

seungwonpark, Apr 15 '21 04:04