Mel stats and Vocoder

Open winddori2002 opened this issue 3 years ago • 2 comments
Hi, I am trying to reproduce your paper and have run into a problem with the mel stats and the vocoder. With your pre-trained vocoder and the mel stats you provide, the synthesized speech quality is quite good. However, when I run the preprocessing code to compute new mel stats, the quality degrades with the same pre-trained vocoder. My questions are:

  1. If I compute new mel stats, is it necessary to train the vocoder again?
  2. Did you use the mel stats from the preprocessing code for vocoder input normalization?

Thank you

winddori2002, Apr 26 '22 05:04

Hi, in my experience, using the same mel stats for the vocoder and the VC model leads to better voice quality. To answer your questions:

  1. I think training a vocoder with the new mel stats could produce higher-quality speech. Alternatively, you can use the mel stats I provide (from the PWG vocoder trained on VCTK) to normalize the mels when training the VC model.
  2. The mel stats used for vocoder input normalization do not come from the preprocessing code; they come from the PWG repo's mel preprocessing.
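
To illustrate why mismatched stats hurt, here is a minimal sketch. It assumes the stats are a per-bin mean and scale applied as `(mel - mean) / scale`; the function names, array shapes, and numeric values are hypothetical and are not taken from this repo or the PWG repo. It shows that the same mel looks different to the vocoder under different stats, and how features normalized with one set of stats could be mapped to another.

```python
import numpy as np

def normalize(mel, mean, scale):
    """Standardize a (frames, n_mels) log-mel with per-bin stats."""
    return (mel - mean) / scale

def denormalize(mel_norm, mean, scale):
    """Invert normalize()."""
    return mel_norm * scale + mean

def renormalize(mel_norm, src_mean, src_scale, dst_mean, dst_scale):
    """Map features normalized with src stats into the dst stats' space."""
    return normalize(denormalize(mel_norm, src_mean, src_scale), dst_mean, dst_scale)

rng = np.random.default_rng(0)
n_mels = 80
# Hypothetical log-mel frames standing in for real extracted features.
mel = rng.normal(loc=-4.0, scale=2.0, size=(100, n_mels))

# Hypothetical stats the pre-trained vocoder was trained with.
vocoder_mean = np.full(n_mels, -4.0)
vocoder_scale = np.full(n_mels, 2.0)

# Hypothetical stats recomputed by a separate preprocessing run.
new_mean = np.full(n_mels, -5.0)
new_scale = np.full(n_mels, 1.5)

matched = normalize(mel, vocoder_mean, vocoder_scale)
mismatched = normalize(mel, new_mean, new_scale)

# Mean absolute gap between what the vocoder expects and what it receives.
mismatch_error = np.abs(matched - mismatched).mean()

# Converting back to the vocoder's stats removes the gap.
fixed = renormalize(mismatched, new_mean, new_scale, vocoder_mean, vocoder_scale)
```

The conversion only helps if both pipelines extract the underlying mel the same way (FFT size, hop length, mel range, log base); if the extraction itself differs, retraining the vocoder or reusing the provided stats, as suggested above, is the safer route.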

Wendison, Apr 27 '22 06:04

Thank you for answering! I understand now and have solved it.

winddori2002, Apr 27 '22 07:04