Tacotron-2

Loss Exploded?

Open dalvlv opened this issue 5 years ago • 6 comments

When I train WaveNet on LJSpeech, I always get "Loss Exploded", even when I resume training from a checkpoint. I'm using the master code unchanged and don't know why this happens. Does anyone know the cause and can help?

dalvlv, Jul 15 '19 03:07

I have the same problem, and the loss is very unstable. Hopefully someone can figure it out.

he804583359, Jul 27 '19 02:07

There is a similar issue when training on LibriTTS

maelp, Jul 30 '19 19:07

I'm training on the real mel-spectrograms, without global conditioning and without GTA.

maelp, Jul 30 '19 19:07

I changed input_type to "mulaw-quantize", set quantize_channels and out_channels to 256, and reran GTA and preprocessing. The loss explosion is now gone, but the eval result is still very bad after 115K steps. [image: step-115000-waveplot] (I'm working with Chinese.)

JasonWei512, Jul 30 '19 23:07
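
For readers unfamiliar with the "mulaw-quantize" setting mentioned above: with 256 channels, the waveform is typically mu-law companded and bucketed into 256 integer classes, so the network predicts a class per sample instead of a continuous value. The sketch below illustrates that standard transform in plain Python. The function names are hypothetical and are not taken from the Tacotron-2 code; only the value 256 comes from the comment above.

```python
import math


def mulaw_quantize(x, quantize_channels=256):
    """Map one sample in [-1, 1] to an integer class in [0, quantize_channels - 1].

    Hypothetical helper for illustration; not the repository's own function.
    """
    mu = quantize_channels - 1
    x = max(-1.0, min(1.0, x))  # clip to the valid range
    # Mu-law companding: compress large amplitudes logarithmically.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift [-1, 1] to [0, mu] and round to the nearest class.
    return int((compressed + 1) / 2 * mu + 0.5)


def inv_mulaw_quantize(y, quantize_channels=256):
    """Map an integer class back to an approximate sample in [-1, 1]."""
    mu = quantize_channels - 1
    compressed = 2 * y / mu - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)


if __name__ == "__main__":
    for sample in (-1.0, -0.1, 0.0, 0.1, 1.0):
        q = mulaw_quantize(sample)
        print(sample, q, round(inv_mulaw_quantize(q), 4))
```

Because the targets become bounded class indices, the loss is a categorical one, which may be why the explosion disappeared in this case. That explanation is a guess; the thread does not confirm the cause.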

I'm experiencing the same thing, any news on this?

a-froghyar, May 16 '20 14:05

I've actually found that if training gets stuck and the loss explodes every time you re-run it, reverting to the checkpoint just before that point lets training continue without a problem. This might have to be repeated every time you get stuck.

In my case, I just changed the numbers in the checkpoint file and deleted the most recent checkpoint data and index files, the ones that caused training to get stuck.

a-froghyar, May 19 '20 15:05
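
The manual rollback described above can be scripted. This is a minimal sketch, assuming the usual TensorFlow 1.x layout: a text file named `checkpoint` in the checkpoint directory with `model_checkpoint_path: "..."` and `all_model_checkpoint_paths: "..."` lines, plus `.index`, `.data-*` and `.meta` files per checkpoint. The function name and the directory argument are hypothetical. Back up the directory before trying it.

```python
import glob
import os
import re


def rollback_latest_checkpoint(ckpt_dir):
    """Drop the newest checkpoint so training resumes from the previous one.

    Hypothetical helper: rewrites the `checkpoint` state file and deletes the
    files of the newest checkpoint. Returns the checkpoint that is now latest.
    """
    state_path = os.path.join(ckpt_dir, "checkpoint")
    with open(state_path) as f:
        text = f.read()

    # Checkpoints are listed oldest to newest in the state file.
    paths = re.findall(r'^all_model_checkpoint_paths: "(.*)"$', text, flags=re.M)
    if len(paths) < 2:
        raise RuntimeError("need at least two checkpoints to roll back")
    dropped, kept = paths[-1], paths[:-1]

    # Point the state file at the previous checkpoint.
    lines = ['model_checkpoint_path: "%s"' % kept[-1]]
    lines += ['all_model_checkpoint_paths: "%s"' % p for p in kept]
    with open(state_path, "w") as f:
        f.write("\n".join(lines) + "\n")

    # Remove the .index, .data-* and .meta files of the dropped checkpoint.
    prefix = dropped if os.path.isabs(dropped) else os.path.join(ckpt_dir, dropped)
    for name in glob.glob(glob.escape(prefix) + ".*"):
        os.remove(name)
    return kept[-1]
```

Calling `rollback_latest_checkpoint("path/to/checkpoint_dir")` once before restarting training mirrors the manual edit; the path here is a placeholder, not a directory name from the repository.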