ParallelWaveGAN HifiGAN training -- obvious harmonics in test files

Open Kristopher-Chen opened this issue 3 years ago • 7 comments

I trained HifiGAN on the VCTK multi-speaker dataset at a 24 kHz sampling rate. I also normalize the input log-Mel spectrogram (with mean=-4, std=4), and I found obvious harmonics in the test files. Have you ever seen this? Any suggestions? Thank you!
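For readers following along, the normalization described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical and not taken from either code base, and it assumes a single global mean and standard deviation (the mean=-4, std=4 quoted above) rather than per-bin statistics.

```python
import numpy as np


def normalize_logmel(logmel, mean=-4.0, std=4.0):
    """Shift and scale a log-Mel spectrogram with global statistics.

    mean/std default to the values quoted in the post; the function
    name itself is hypothetical.
    """
    logmel = np.asarray(logmel, dtype=np.float32)
    return (logmel - mean) / std


def denormalize_logmel(normalized, mean=-4.0, std=4.0):
    """Invert normalize_logmel with the same statistics."""
    normalized = np.asarray(normalized, dtype=np.float32)
    return normalized * std + mean
```

Whatever statistics are used, the same ones have to be applied at training and at inference time, otherwise the vocoder sees inputs from a different range than it was trained on.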

Mar 14 '22 03:03 Kristopher-Chen

BTW, the discriminators' loss is quite small, which may suggest the discriminators are too strong?
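To make "too strong" concrete: HiFi-GAN trains its discriminators with a least-squares GAN objective, so the discriminator loss can be sketched as below. This is a simplified NumPy version for a single discriminator, not the code from either repository, and the function name is made up for illustration.

```python
import numpy as np


def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares GAN loss for one discriminator (illustrative sketch).

    real_scores: discriminator outputs on ground-truth audio (target 1).
    fake_scores: discriminator outputs on generated audio (target 0).
    """
    real_scores = np.asarray(real_scores, dtype=np.float64)
    fake_scores = np.asarray(fake_scores, dtype=np.float64)
    real_loss = np.mean((real_scores - 1.0) ** 2)
    fake_loss = np.mean(fake_scores ** 2)
    return float(real_loss + fake_loss)
```

Under this form, a loss close to 0 means the discriminator scores real audio near 1 and generated audio near 0, i.e. it separates them almost perfectly, while a discriminator that outputs 0.5 for everything gets a loss of 0.5.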

Mar 14 '22 03:03 Kristopher-Chen

Did you use this repository? Or is this a general question about HifiGAN?

Mar 14 '22 11:03 kan-bayashi

Hi, actually I referred to both your repository and the official version. I trained for several epochs with the official code and found the discriminators' losses to be around 0.1~0.2, but the obvious harmonics were still there. So I wonder whether this happens in the early training stages? But the discriminators' loss is quite strange...

Mar 15 '22 03:03 Kristopher-Chen

OK. How many iterations did you run? In my experiments, around 200k iterations are enough to generate reasonable speech. I'm not familiar with the official implementation, but in my case the official optimizer settings did not work well. The following issue may help you: https://github.com/kan-bayashi/ParallelWaveGAN/issues/278

Mar 15 '22 05:03 kan-bayashi

There seems to be something wrong with the discriminators. The losses get smaller after more epochs. Normal values would be around 0.1~0.2 for each discriminator, but mine fall outside that range.

Apr 06 '22 09:04 Kristopher-Chen

@Kristopher-Chen have you resolved the problems?

Feb 22 '23 09:02 MlWoo

When I followed the original code, this problem was solved.

As for the discriminator losses, the 2nd and 3rd MSD losses easily become small, while the others look normal.

Moreover, the feature map loss keeps growing gradually. But interestingly, the generated samples sound natural...
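For context, the feature map (feature matching) loss in HiFi-GAN is an L1 distance between the discriminators' intermediate feature maps for real and generated audio. A rough NumPy sketch is below; the function name is hypothetical, and the real implementations additionally sum over all sub-discriminators and apply a loss weight, which is left out here.

```python
import numpy as np


def feature_matching_loss(real_feature_maps, fake_feature_maps):
    """Sum of per-layer mean absolute differences (illustrative sketch).

    Each argument is a list with one array per discriminator layer;
    arrays at the same index must have the same shape.
    """
    total = 0.0
    for real, fake in zip(real_feature_maps, fake_feature_maps):
        real = np.asarray(real, dtype=np.float64)
        fake = np.asarray(fake, dtype=np.float64)
        total += float(np.mean(np.abs(real - fake)))
    return total
```

Because this loss is measured in the discriminators' own feature space, its value depends on how the discriminators change during training, so a slowly growing value does not necessarily mean the generated audio is getting worse.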

May 11 '23 01:05 Kristopher-Chen
