SpeechSplit: The validation loss is rising

The loss on my training set looks normal, but the loss on the validation set keeps rising. Each entry in the validation set has the structure [speaker, speaker_onehot, (spmel, raptf0, len, chapter)], where spmel and raptf0 were extracted directly by make_spect_f0.py. Is there any problem with this?

I tried several times, and the validation loss rises every time.

Oct 01 '21 04:10 3139725181

Sounds like overfitting.

Oct 01 '21 13:10 auspicious3000

This doesn't look like overfitting. Is the structure of my validation set correct?

[speaker, speaker_onehot, (spmel, raptf0, len, chapter)], spmel and raptf0 were extracted by make_spect_f0.py directly.

Oct 01 '21 14:10 3139725181

The structure of the validation set does not matter. Just make sure the input to the model is correct.

Oct 01 '21 14:10 auspicious3000

Yes. I want to confirm: are both mel and raptf0 extracted directly by make_spect_f0.py and used as input as-is, or did you do some additional processing?

Oct 01 '21 15:10 3139725181

Yes. Same as training
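One way to check "same as training" is to compare basic statistics of a training feature array and a validation feature array. The sketch below is an illustration only: the function names `summarize` and `mismatches` and the tolerance `tol` are hypothetical, not part of SpeechSplit, and it assumes the features (for example spmel) are NumPy arrays of shape (frames, feature_dim).

```python
import numpy as np

def summarize(features):
    """Collect statistics that should agree between training and validation."""
    features = np.asarray(features)
    return {
        "ndim": features.ndim,
        "feature_dim": features.shape[-1] if features.ndim > 1 else 1,
        "dtype": str(features.dtype),
        "min": float(features.min()),
        "max": float(features.max()),
    }

def mismatches(train_feat, val_feat, tol=0.25):
    """Return the names of statistics that differ (hypothetical helper)."""
    a, b = summarize(train_feat), summarize(val_feat)
    problems = [k for k in ("ndim", "feature_dim", "dtype") if a[k] != b[k]]
    # Compare value ranges relative to the spread of the training features.
    span = max(a["max"] - a["min"], 1e-8)
    for k in ("min", "max"):
        if abs(a[k] - b[k]) > tol * span:
            problems.append(k)
    return problems
```

An empty list means none of these coarse checks found a difference; it does not prove the two pipelines are identical.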

Oct 01 '21 19:10 auspicious3000

Hello. Did you figure out the issue? I have a kinda similar issue where my validation loss isn't getting smaller. It fluctuates around 180-200. Only my training loss for the Generator (G) keeps getting smaller, while the training loss for P fluctuates around 0.01-0.02.

I've trained the model on the Speech Commands dataset, which consists of single-word clips of 1 second each. Could it be that SpeechSplit won't perform well on such data?

Dec 15 '21 01:12 AShoydokova

@AShoydokova 1. Your training set may be too small. 2. validation setup should be consistent with training

Dec 15 '21 02:12 auspicious3000

@AShoydokova 2. validation setup should be consistent with training

Gotcha. Yes, my data is small, and I even trained on a smaller subset of it to get quick results.

Could you elaborate on point 2? I created the validation set as 0.10 of the total data and only marked a data point as validation if its speaker was already in the training data. Should I consider more things? Thank you again for the model and quick responses!
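The split described above can be sketched as follows. This is a minimal illustration, assuming the data is available as a list of (speaker_id, utterance_id) pairs; that layout and the function name `split_by_speaker` are hypothetical and not taken from the SpeechSplit code.

```python
import random
from collections import defaultdict

def split_by_speaker(utterances, val_fraction=0.10, seed=0):
    """Hold out roughly val_fraction of each speaker's utterances.

    Every speaker keeps at least one utterance in training, so the
    validation set never contains an unseen speaker.
    """
    rng = random.Random(seed)
    by_speaker = defaultdict(list)
    for speaker, utt in utterances:
        by_speaker[speaker].append(utt)

    train, val = [], []
    for speaker in sorted(by_speaker):
        utts = sorted(by_speaker[speaker])
        rng.shuffle(utts)
        # Never move a speaker's only utterance into validation.
        n_val = min(round(len(utts) * val_fraction), len(utts) - 1)
        val += [(speaker, u) for u in utts[:n_val]]
        train += [(speaker, u) for u in utts[n_val:]]
    return train, val
```

Splitting per speaker rather than over the pooled data keeps the validation fraction similar for every speaker, which matters when some speakers have far fewer clips than others.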

Dec 15 '21 02:12 AShoydokova

I also use the VCTK dataset, the same as the paper, and I got a rising validation loss before. I found that simply concatenating the multiple wavs into one longer wav solves this problem. For the training set, each speaker ends up with one long wav, like the demo training data. I think some operation in the data loader causes this situation; see data_loader.py. So maybe you can try concatenating your training data into longer files and training again.
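The concatenation step could look roughly like the sketch below. It is an illustration under assumptions, not the code used in this thread: the function name `concat_speaker_wavs` and the one-directory-per-speaker layout are hypothetical, and the standard-library `wave` module only handles uncompressed PCM WAV files.

```python
import wave
from pathlib import Path

def concat_speaker_wavs(speaker_dir, out_path):
    """Join every .wav file in speaker_dir into one file at out_path.

    Returns the number of input files that were joined.
    """
    paths = sorted(Path(speaker_dir).glob("*.wav"))
    if not paths:
        raise ValueError(f"no .wav files in {speaker_dir}")
    params = None
    frames = []
    for path in paths:
        with wave.open(str(path), "rb") as src:
            fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                # Mixing sample rates or widths would corrupt the output.
                raise ValueError(f"{path} has format {fmt}, expected {params}")
            frames.append(src.readframes(src.getnframes()))
    with wave.open(str(out_path), "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        dst.writeframes(b"".join(frames))
    return len(paths)
```

Writing the output outside `speaker_dir` avoids picking up the concatenated file as an input on a second run. Features would then be extracted from the long file in the usual way.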

Dec 25 '21 07:12 ZZdozeoff

Hello, I am using the demo.pkl file provided in the code as my validation set. It contains only 2 voices, and the loss keeps going up during training. Could you please tell me whether you made changes to this part? Could you share your code, or the code and hyperparameter settings of the solver part? Thank you very much!

Mar 03 '23 01:03 9527950
