
Validation result for a single file is not stable

Open duongquangduc opened this issue 7 years ago • 4 comments

I trained on the 960-hour LibriSpeech data set for a week and got a WER of 38% and a CER of 12% on the dev-clean validation set. My issue is that when I evaluate a single audio file picked at random from the validation set, the result is quite different from the one reported when the whole set is evaluated.

For example, the reference transcript of the audio file 84-121123-0007.flac is 'WHAT DO YOU MEAN SIR'. When that single sample is evaluated on its own, the decoded output is 'AP WHA E MA EMSIR ' (CER 70%, WER 100%), while the output from evaluating the whole dev-clean set is 'WHAT DOYOU MEAN SIR'.
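For context on those numbers: WER and CER are usually computed as the edit (Levenshtein) distance normalized by the reference length, at the word and character level respectively. The following is a minimal, generic sketch, not the evaluation code used by this repository, that reproduces the word-level scores for the two decodes quoted above:

def levenshtein(ref, hyp):
    """Edit distance between two token sequences (dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]

def wer(ref, hyp):
    ref_words = ref.split()
    return levenshtein(ref_words, hyp.split()) / float(len(ref_words))

def cer(ref, hyp):
    return levenshtein(list(ref), list(hyp)) / float(len(ref))

reference = "WHAT DO YOU MEAN SIR"
print(wer(reference, "AP WHA E MA EMSIR"))    # 1.0: no hypothesis word matches
print(wer(reference, "WHAT DOYOU MEAN SIR"))  # 0.4: 2 word errors out of 5
print(cer(reference, "AP WHA E MA EMSIR"))    # character-level score is lower

Note that the single-sample decode shares no whole words with the reference, so its WER is 100% even though many characters line up, which is why its CER is much lower than its WER.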

Could you please suggest what might be causing this?

duongquangduc avatar Aug 02 '17 04:08 duongquangduc

Please post the exact command you used to train the model so that we can help diagnose.

Neuroschemata avatar Aug 08 '17 20:08 Neuroschemata

@Neuroschemata, this is the command I used to train the model:

python train.py --manifest train:/root/deepspeech/librispeech/train-clean-100/1000_hour_manifest.csv --manifest val:/root/deepspeech/librispeech/train-clean-100/val-manifest.csv -e7 -z32 -s /deepspeech/speech/model_ds2.pkl --model_file /deepspeech/speech/model_ds2.pkl

duongquangduc avatar Aug 09 '17 07:08 duongquangduc

This issue is likely related to a bug in model serialization: https://github.com/NervanaSystems/neon/issues/359. We are working on a fix and will give you instructions on how to update when we get one out. Thanks for catching it!
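Until a fix is available, one generic way to confirm that serialization is the culprit is to compare outputs on a fixed input before and after a save/load round trip. The sketch below is only an illustration using plain NumPy arrays and pickle; it is not neon's serialization format or API, and the forward pass is a stand-in:

import pickle
import numpy as np

# Illustrative only: round-trip a dict of "weights" through pickle and
# confirm the reloaded values reproduce the same forward-pass output.
rng = np.random.RandomState(0)
weights = {"W": rng.randn(16, 8), "b": rng.randn(8)}
x = rng.randn(4, 16)

def forward(params, inp):
    # Stand-in for a real model forward pass.
    return inp.dot(params["W"]) + params["b"]

before = forward(weights, x)

with open("weights.pkl", "wb") as f:
    pickle.dump(weights, f)
with open("weights.pkl", "rb") as f:
    reloaded = pickle.load(f)

after = forward(reloaded, x)

# If the serialization path is healthy, the outputs should match exactly
# (or to within floating-point tolerance for more complex formats).
assert np.allclose(before, after)

The same idea applies to the real model: decode one utterance, save the model, reload it, decode again, and diff the transcripts; if they diverge, the serialization bug referenced above is the likely cause.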

tyler-nervana avatar Aug 09 '17 22:08 tyler-nervana

Hi @tyler-nervana, is there any update on this? Also, will I need to retrain the model once the fix is released? Thanks!

duongquangduc avatar Sep 22 '17 10:09 duongquangduc