deepspeech
nan cost
Hi, I'm getting a NaN cost after resuming training from the pre-trained model (librispeech_16_epochs.prm). The cost becomes NaN after epoch 16/17, and the testing results (after each epoch) are null.
OS: Ubuntu 16.04
GPU: Nvidia Titan X Pascal (12 GB RAM)
Neon: version 1.9.0
Could you share a bit more detail about your setup? We haven't seen this behavior. What command are you running to train further? Which dataset are you using? Is there anything different about your data compared to the librispeech dataset?
I am getting the same problem. My audio data are in WAV format rather than FLAC. Is this a problem?
following is my command:
python train.py --manifest train:data/train_1700hour.csv --manifest val:data/dev_1700hour.csv -e 20 -z 12 -s model/ds2_1700hour_20_epochs.prm --model_file model/librispeech_16_epochs.prm
My transcription files contain '\n' characters, which leads to the NaN cost problem.
Thanks for the quick update. Currently anything in the transcript files is treated as a character, including "\n".
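Since stray newlines in transcript files end up being treated as characters, it can help to sanitize the transcripts before training. Below is a minimal sketch of such a check; `clean_transcript` is a hypothetical helper, not part of the deepspeech repo.

```python
import os
import tempfile

def clean_transcript(path):
    """Rewrite a transcript file with trailing whitespace/newlines removed.

    Hypothetical helper: strips '\n', '\r', and trailing spaces so they
    are not treated as extra characters by the CTC cost.
    """
    with open(path) as f:
        text = f.read()
    cleaned = text.rstrip("\n\r ")
    with open(path, "w") as f:
        f.write(cleaned)
    return cleaned

# Demo on a temporary file containing a trailing newline.
fd, p = tempfile.mkstemp()
os.close(fd)
with open(p, "w") as f:
    f.write("hello world\n")
print(repr(clean_transcript(p)))  # 'hello world'
os.remove(p)
```

Running this over every transcript referenced by the training manifest before invoking `train.py` would avoid the '\n'-as-character issue described above.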
I also get the same problem when .wav files are used. When I converted the files to FLAC, the NaN problem did not appear.
Thanks for noticing the difficulty with .wav files. We'll take a look.
Hello, I'm writing here because I encountered the NaN cost problem as well. I am using Neon 2.0 with Python 2.7 on Ubuntu 16.04 with a GTX 1080 backend.
In my case I am using librispeech train-500-other, and after 50-60% of the epoch the cost becomes NaN. I have tried training the model using only the other librispeech packages and it trains as expected. Any thoughts on this?
I was able to fix the issue by dropping the learning rate by two orders of magnitude. The issue was apparently due to an infinite cost caused by a prediction that was too confident of a very wrong value.
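A small numerical illustration (not Neon code) of why an overconfident wrong prediction produces an infinite cost: the negative log-likelihood of the correct label, computed in float32, blows up as the probability assigned to the correct label underflows to zero.

```python
import numpy as np

def nll_cost(p_correct):
    """Negative log-likelihood -log(p) of the correct label, in float32.

    Illustration only: stands in for the per-label term of a
    CTC/cross-entropy-style cost evaluated on the GPU in float32.
    """
    p32 = np.float32(p_correct)
    with np.errstate(divide="ignore"):
        return -np.log(p32)

print(nll_cost(1e-2))   # moderate cost, ~4.6
print(nll_cost(1e-10))  # large cost, ~23
# Below float32's smallest subnormal (~1.4e-45), p underflows to 0.0,
# so the cost is inf; once an inf enters the backward pass, the next
# update turns the weights (and all subsequent costs) into NaN.
print(nll_cost(1e-50))  # inf
```

This is consistent with the fix above: a smaller learning rate keeps updates small enough that the network never drives the probability of the correct label into the underflow regime.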