VoiceCraft
Validation loss Divergence?
Thanks for your great work! I'm currently training the 100M VoiceCraft model on LJSpeech plus custom data (maybe 32 hours?), but I'm facing an issue with validation loss divergence.
I think the cause is the delayed stacking described in your paper, which changes the sequence every epoch. If the training accuracy of all 4 codebooks reaches 1, I would expect the validation loss to decrease.
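For context, the delayed stacking mentioned above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual implementation: codebook k is shifted right by k frames so the model predicts the codebooks of a given frame across successive steps. `PAD` is a hypothetical padding token id; the exact padding and masking scheme in VoiceCraft may differ.

```python
import numpy as np

PAD = -1  # hypothetical padding token id

def delay_stack(codes: np.ndarray) -> np.ndarray:
    """codes: (K, T) array of codec tokens -> (K, T + K - 1) delayed array.

    Row k is shifted right by k frames, so column t holds codebook 0 of
    frame t, codebook 1 of frame t-1, and so on.
    """
    K, T = codes.shape
    out = np.full((K, T + K - 1), PAD, dtype=codes.dtype)
    for k in range(K):
        out[k, k:k + T] = codes[k]
    return out

codes = np.arange(8).reshape(4, 2)  # 4 codebooks, 2 frames
stacked = delay_stack(codes)
print(stacked.shape)  # (4, 5)
```

Because the stacked sequence depends on where each utterance is cut, re-segmenting the data each epoch changes the token sequences the model sees, which is the behavior the question refers to.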
For this reason, I have two questions.
- Could you explain whether my training is correct or not (loss curve, analysis, etc.)?
- Could you share your training and validation curves?
Best regards
Seung Woo Yu
Thanks
Not sure about the validation loss.
The model is overfitting on your data (assuming there isn't a significant train/val domain mismatch): it's getting 95%+ training top-10 accuracy on codebook 1, but only 30% on the validation set.
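The top-10 accuracy metric discussed here can be computed per codebook roughly as below. This is a minimal sketch of the metric, not the repo's logging code; it counts a prediction as correct if the target token is among the 10 highest-scoring logits.

```python
import numpy as np

def topk_acc(logits: np.ndarray, targets: np.ndarray, k: int = 10) -> float:
    """logits: (N, V) per-codebook logits; targets: (N,) token ids.

    Returns the fraction of positions where the target id is among the
    k largest logits.
    """
    topk = np.argsort(logits, axis=-1)[:, -k:]       # indices of the k largest
    hits = (topk == targets[:, None]).any(axis=-1)   # target in top-k?
    return float(hits.mean())

# Toy example: each row's top-10 indices are 10..19.
logits = np.arange(60, dtype=float).reshape(3, 20)
print(topk_acc(logits, np.array([19, 19, 19])))  # 1.0
```

A large gap between this number on the training set and on the validation set (95%+ vs. 30%, as above) is the overfitting signal being described.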
I might no longer have the original curves on our server. But in similar experiments I've done recently, I get at best 60% top-10 accuracy on codebook 1 on the training set, and similar on the validation set (a little lower than training because of the train-val mismatch in GigaSpeech).
Maybe try an even smaller model.
@jasonppy thanks a lot for your answers. Do you have a rough idea of what a good validation loss would be? For example, roughly where you ended up in the 9k-hour GigaSpeech experiment you report in the paper. Thanks!
I still have the results for a slightly different model, but they should be mostly the same:

- cb1: train top10acc 0.5548 (0.5261), val top10acc 0.5002
- cb2: train top10acc 0.4790 (0.4456), val top10acc 0.4425
- cb3: train top10acc 0.4369 (0.3947), val top10acc 0.4082
- cb4: train top10acc 0.3694 (0.3226), val top10acc 0.3555
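As a quick sanity check on the reported numbers, the train-val gap per codebook is only a few points, consistent with little overfitting (values copied from the reply above; the parenthesized figures are omitted):

```python
# Per-codebook top-10 accuracies reported in the reply.
train = [0.5548, 0.4790, 0.4369, 0.3694]
val = [0.5001731514930725, 0.4425261914730072,
       0.4081890881061554, 0.3555351495742798]

# Train-val gap per codebook, rounded to 4 decimals.
gaps = [round(t - v, 4) for t, v in zip(train, val)]
print(gaps)  # [0.0546, 0.0365, 0.0287, 0.0139]
```

Compare this ~1-5 point gap with the 95% vs. 30% gap described earlier in the thread.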
That's super useful, thanks very much!
@yuseungwoo I am seeing almost the same results as you describe, i.e., great train loss but almost instant overfitting on the val loss. Were you able to solve your problem? How big is your dataset?
Could you share the training curve or the accuracy for the 830M model? I tried pretraining on my custom dataset, but it diverges in the middle of training 😂