ParallelWaveGAN
HifiGAN training -- obvious harmonics in test files
I trained HifiGAN on the multi-speaker VCTK dataset at a 24 kHz sampling rate. I also normalize the input log-Mel spectrogram (mean=-4, std=4). The test files contain obvious harmonics, as shown below.
Have you run into this before? Any suggestions? Thank you!
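For reference, the normalization described above can be sketched as follows. This is only a minimal illustration, assuming the fixed mean of -4 and standard deviation of 4 are applied elementwise to the log-Mel values. The function names and the nested-list representation are hypothetical; a real pipeline would work on arrays or tensors.

```python
def normalize_log_mel(log_mel, mean=-4.0, std=4.0):
    """Apply (x - mean) / std to every value of a frames-by-bins nested list."""
    return [[(value - mean) / std for value in frame] for frame in log_mel]


def denormalize_log_mel(normalized, mean=-4.0, std=4.0):
    """Invert normalize_log_mel using the same statistics."""
    return [[value * std + mean for value in frame] for frame in normalized]


# Toy 2-frame, 2-bin log-Mel "spectrogram" (made-up values for illustration).
toy_log_mel = [[-8.0, -4.0], [0.0, -6.0]]
normalized = normalize_log_mel(toy_log_mel)
restored = denormalize_log_mel(normalized)
```

If the features fed to the vocoder at inference are normalized with different statistics than those used in training, the mismatch can degrade the output, so it is worth checking that both stages share the same values.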

BTW, the discriminator losses are quite small. Could this mean the discriminators are too strong?

Did you use this repository, or is this a general question about HiFi-GAN?
Hi, I actually referred to both your repository and the official implementation. I trained for several epochs with the official code and found the discriminator losses to be around 0.1~0.2, but the output still had obvious harmonics. So I wonder whether this is normal in the early stages of training. The discriminator losses still look quite strange, though...
OK. How many iterations did you run? In my experiments, around 200k iterations are needed to generate reasonable speech. I'm not familiar with the official implementation, but in my case the official optimizer settings did not work well. The following issue may help you: https://github.com/kan-bayashi/ParallelWaveGAN/issues/278
There seems to be something wrong with the discriminators: the losses keep getting smaller as training goes on. Normal values would be around 0.1~0.2 for each discriminator, but mine are as shown below.
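To make it easier to interpret these values, here is a rough sketch of a least-squares GAN discriminator loss, the kind of objective the HiFi-GAN paper describes. Whether a particular training setup uses exactly this form is an assumption; the function name and the toy scores are hypothetical.

```python
def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: real outputs pushed to 1, fake to 0."""
    real_term = sum((1.0 - s) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
undecided = lsgan_discriminator_loss([0.5, 0.5], [0.5, 0.5])

# A discriminator that separates them almost perfectly gives a loss near 0
# (roughly 0.0125 for these made-up scores).
confident = lsgan_discriminator_loss([0.95, 0.9], [0.05, 0.1])
```

Under this objective, a loss that keeps shrinking toward zero means the discriminator is separating real and generated audio more and more easily, which is consistent with the "discriminators are too strong" reading.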

@Kristopher-Chen have you resolved the problems?
When I followed the original code, the problem was solved.
As for the discriminator losses, the 2nd and 3rd MSD losses easily become small, while the others look normal.
Moreover, the feature map loss keeps growing gradually. Interestingly, though, the generated samples sound natural...
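The feature map loss mentioned above (usually called the feature matching loss) can be sketched roughly as below. This is a simplified illustration, assuming an L1 distance between discriminator feature maps of real and generated audio, averaged per layer and summed over layers. The function name, the flat-list feature maps, and the toy values are hypothetical, and actual implementations may weight or average the terms differently.

```python
def feature_matching_loss(real_features, fake_features):
    """Sum over layers of the mean absolute difference between feature maps."""
    total = 0.0
    for real_layer, fake_layer in zip(real_features, fake_features):
        diffs = [abs(r - f) for r, f in zip(real_layer, fake_layer)]
        total += sum(diffs) / len(diffs)
    return total


# Two toy "layers" of flattened discriminator activations (made-up values).
real = [[1.0, 2.0], [0.5, 0.5, 1.5]]
fake = [[1.0, 1.0], [0.5, 1.5, 1.5]]
loss = feature_matching_loss(real, fake)
```

Because this distance is measured in the discriminator's own feature space, which changes as the discriminator trains, its absolute value may not be directly comparable across training, which might be why it can grow while the samples still sound natural.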