Multi-Singer

NaN errors when training

Open Robinatp opened this issue 2 years ago • 4 comments

Hello, I ran into the following problem during training:

2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/embed_loss = nan.
2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/spk_similariy = nan.
2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/spectral_convergence_loss = nan.
2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/log_stft_magnitude_loss = nan.
2022-05-12 14:56:20,956 (train:487) INFO: (Steps: 1000) train/generator_loss = nan.

Do you have any suggestions?

Thx!

Robinatp, May 12 '22 07:05

Hi, that's odd; I haven't come across this issue. Could you try retraining the model on another machine to see whether that solves it? :)

Rongjiehuang, May 13 '22 08:05

I found a problem. Maybe you should update the code to add a small epsilon (1e-8) inside the log, as below:

encoder/audio.py:171: dBFS_change = target_dBFS - 10 * np.log10(np.mean(wav ** 2)+1e-8)
encoder/audio.py:180: dBFS_change = target_dBFS - 10 * torch.log10(torch.mean(wav ** 2)+1e-8)
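
To illustrate why the epsilon matters, here is a minimal NumPy sketch, not the repository's actual function. It assumes the affected code scales a waveform by the computed dBFS change; the helper name `normalize_volume`, the target of -30 dBFS, and the all-zero test clip are hypothetical. For a silent clip the mean power is 0, so `log10` returns `-inf`, the gain becomes `inf`, and `0 * inf` yields NaN.

```python
import numpy as np


def normalize_volume(wav, target_dBFS=-30.0, eps=1e-8):
    """Scale wav toward target_dBFS (hypothetical helper, for illustration only)."""
    # eps keeps the argument of log10 strictly positive, even for all-zero input.
    dBFS_change = target_dBFS - 10 * np.log10(np.mean(wav ** 2) + eps)
    return wav * np.power(10.0, dBFS_change / 20)


# A silent clip: one second of zeros at an assumed 16 kHz sample rate.
silent = np.zeros(16000, dtype=np.float32)

# Without eps: log10(0) = -inf, so the gain is +inf and 0 * inf = nan.
with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
    gain_db = -30.0 - 10 * np.log10(np.mean(silent ** 2))
    unguarded = silent * np.power(10.0, gain_db / 20)

# With eps the gain is large but finite, so silence stays silence.
guarded = normalize_volume(silent)

print(np.isnan(unguarded).all())   # every sample is NaN
print(np.isfinite(guarded).all())  # every sample is finite
```

A single silent or near-silent clip in a batch may be enough to turn every loss that depends on it into NaN, which would be consistent with the first log above.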

Robinatp, May 20 '22 10:05

However, there is now a new problem: train/spk_similariy is still nan, while the other losses are finite.

2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/embed_loss = 0.0153.
2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/spk_similariy = nan.
2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/spectral_convergence_loss = 0.2280.
2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/log_stft_magnitude_loss = 0.7044.
2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/generator_loss = 0.9629.
2022-05-20 20:32:54,063 (train:512) INFO: (Steps: 3000) train/embed_loss = 0.0155.
2022-05-20 20:32:54,064 (train:512) INFO: (Steps: 3000) train/spk_similariy = nan.
2022-05-20 20:32:54,064 (train:512) INFO: (Steps: 3000) train/spectral_convergence_loss = 0.2294.
2022-05-20 20:32:54,064 (x2num:14) WARNING: NaN or Inf found in input tensor.
2022-05-20 20:32:54,064 (train:512) INFO: (Steps: 3000) train/log_stft_magnitude_loss = 0.7057.
2022-05-20 20:32:54,064 (train:512) INFO: (Steps: 3000) train/generator_loss = 0.9660.
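
The thread does not show how `spk_similariy` is computed, so the following is only a guess at one possible cause, not a diagnosis. Assuming the metric is a cosine similarity between speaker embeddings, a zero-norm embedding would give 0/0 = NaN even when the other losses are finite. The function name `cosine_similarity`, the `eps` value, and the example vectors are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity that stays finite when either vector has zero norm."""
    # Clamp the denominator so a zero-norm vector gives 0.0 instead of 0/0.
    denom = max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    return float(np.dot(a, b) / denom)


ref = np.array([0.3, -0.1, 0.5])   # made-up reference embedding
zero = np.zeros(3)                 # degenerate all-zero embedding

# Unguarded version: 0 / 0 evaluates to nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = np.dot(ref, zero) / (np.linalg.norm(ref) * np.linalg.norm(zero))

print(np.isnan(naive))                # the unguarded value is NaN
print(cosine_similarity(ref, zero))   # the guarded value is finite
```

If the real metric is computed differently, the same check still applies: look for a division, logarithm, or square root whose input can reach zero.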

Robinatp, May 20 '22 12:05

I'm facing the same problem. Have you managed to solve it? Thanks.

qiao131, Oct 10 '22 09:10