hifi-gan
Training loss explosion
First of all, thank you. The code you provided has been very helpful for studying TTS and HiFi-GAN.
When I train with the provided code on the original LJSpeech data, the loss explodes. I have tried adjusting the learning rate and other hyperparameters, but the problem persists, so I would like to ask for advice.
The training log and the error output are included below.
....
checkpoints directory : cp_hifigan
Epoch: 1
Steps : 0, Gen Loss Total : 101.349, Mel-Spec. Error : 2.058, s/b : 585.139
Steps : 5, Gen Loss Total : 135.118, Mel-Spec. Error : 2.176, s/b : 0.496
Steps : 10, Gen Loss Total : 106.757, Mel-Spec. Error : 1.938, s/b : 0.435
Steps : 15, Gen Loss Total : 94.055, Mel-Spec. Error : 1.701, s/b : 0.435
Steps : 20, Gen Loss Total : 141.385, Mel-Spec. Error : 1.857, s/b : 0.433
Steps : 25, Gen Loss Total : 196.452, Mel-Spec. Error : 3.813, s/b : 0.443
Steps : 30, Gen Loss Total : 64922832.000, Mel-Spec. Error : 1.813, s/b : 0.454
Steps : 35, Gen Loss Total : 199.692, Mel-Spec. Error : 3.854, s/b : 0.455
Steps : 40, Gen Loss Total : nan, Mel-Spec. Error : 2.021, s/b : 0.443
Steps : 45, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.449
Steps : 50, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.444
Steps : 55, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.436
Steps : 60, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.449
Steps : 65, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.452
Steps : 70, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.452
Steps : 75, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.464
The log above is from a run with the basic config. Gen Loss Total first spikes at step 30, becomes nan at step 40, and from step 45 onward all loss values are nan.
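To pin down exactly where a run like this first goes wrong, the log text can be scanned with a small script. The following is only a sketch: `parse_log`, `first_non_finite`, `first_above`, and the `1e4` threshold are names and values made up for illustration, and the regular expression assumes the log format printed above.

```python
import math
import re

# Matches one entry of the training log shown above (assumed format):
# "Steps : N, Gen Loss Total : X, Mel-Spec. Error : Y, ..."
LOG_ENTRY = re.compile(
    r"Steps\s*:\s*(\d+),\s*"
    r"Gen Loss Total\s*:\s*([^,\s]+),\s*"
    r"Mel-Spec\. Error\s*:\s*([^,\s]+)"
)

def parse_log(text):
    """Return a list of (step, gen_loss, mel_error) tuples found in the log text."""
    return [
        (int(m.group(1)), float(m.group(2)), float(m.group(3)))
        for m in LOG_ENTRY.finditer(text)
    ]

def first_non_finite(rows):
    """Return the first step whose generator or mel loss is nan/inf, or None."""
    for step, gen_loss, mel_error in rows:
        if not (math.isfinite(gen_loss) and math.isfinite(mel_error)):
            return step
    return None

def first_above(rows, limit):
    """Return the first step whose generator loss exceeds `limit`, or None."""
    for step, gen_loss, _ in rows:
        if gen_loss > limit:
            return step
    return None

if __name__ == "__main__":
    sample = (
        "Steps : 25, Gen Loss Total : 196.452, Mel-Spec. Error : 3.813, s/b : 0.443 "
        "Steps : 30, Gen Loss Total : 64922832.000, Mel-Spec. Error : 1.813, s/b : 0.454 "
        "Steps : 35, Gen Loss Total : 199.692, Mel-Spec. Error : 3.854, s/b : 0.455 "
        "Steps : 40, Gen Loss Total : nan, Mel-Spec. Error : 2.021, s/b : 0.443"
    )
    rows = parse_log(sample)
    print("first spike above 1e4 at step:", first_above(rows, 1e4))
    print("first non-finite loss at step:", first_non_finite(rows))
```

On the excerpt used in the sample, the spike is reported at step 30 and the first nan at step 40, so the ten steps in between are the ones worth inspecting.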
The following error occurs at step 1000.
Steps : 985, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.430
Steps : 990, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.431
Steps : 995, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.440
Steps : 1000, Gen Loss Total : nan, Mel-Spec. Error : nan, s/b : 0.427
Traceback (most recent call last):
File "train.py", line 271, in
I look forward to your reply. Thank you.