
[NVC-Net] About 16 kHz training and model convergence

Open Aria-K-Alethia opened this issue 2 years ago • 2 comments

Hi,

Thank you for sharing your great work!

I'm using nvcnet to train a Japanese voice conversion model, and I have two questions.

First, I tried to adapt your code to 16 kHz wavs by making the following two changes:

  1. changed sr in hparams.py from 22050 to 16000
  2. changed segment_length in hparams.py from 32768 to 16384

The training goes well but the performance is bad even after 400 epochs.

Do you have any advice on training nvcnet on 16 kHz wavs? Do I need any other modifications to make the training work well?
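For reference, the two changes above also shorten each training segment in seconds, not only in samples. The sketch below is plain arithmetic on the values quoted in this post; the helper names are hypothetical and are not part of the NVC-Net code, and it does not check whether the model requires segment_length to be a multiple of some downsampling factor.

```python
def segment_seconds(segment_length: int, sr: int) -> float:
    """Duration in seconds of a segment of `segment_length` samples at rate `sr`."""
    return segment_length / sr


def matched_segment_length(ref_length: int, ref_sr: int, new_sr: int) -> int:
    """Number of samples at `new_sr` covering the same duration as `ref_length` at `ref_sr`."""
    return round(ref_length * new_sr / ref_sr)


# Values quoted in this issue (hypothetical helper names, not NVC-Net API).
original = segment_seconds(32768, 22050)   # default hparams
modified = segment_seconds(16384, 16000)   # the 16 kHz attempt
matched = matched_segment_length(32768, 22050, 16000)

print(f"original segment: {original:.3f} s")
print(f"modified segment: {modified:.3f} s")
print(f"samples at 16 kHz for the original duration: {matched}")
```

So the 16 kHz run sees roughly 1.02 s of audio per segment instead of roughly 1.49 s. Whether that difference matters for quality is not something this thread establishes; it is only one of the things that changed along with the sample rate.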

Second, could you share the value of g_loss_rec when the model converges? In my training, g_loss_rec converged to around 0.9 to 1.2, and I'm not sure whether this is what I should expect at convergence.

Aria-K-Alethia, Dec 08 '22 19:12

Hi, we haven't tried to train NVC-Net on 16 kHz. Under the current hyper-parameters, g_loss_rec should be around 1.2 to 1.3. In your case, it could be that the other weighting terms are not appropriate for 16 kHz.

bacnguyencong-sony, Dec 12 '22 13:12

Hi,

Thank you for your answer! Do you mean that the converged g_loss_rec value is normal in my case?

Aria-K-Alethia, Dec 12 '22 14:12