eesen
CMVN error on TIMIT Database
Hi, I ran into a problem when training an LSTM-RNN acoustic model on the TIMIT database. Below are parts of my script and the output from running it. The output contains what looks like an error, e.g., LOG (apply-cmvn:main():apply-cmvn.cc:129) Applied cepstral mean and variance normalization to 400 utterances, errors on 0 . I can't figure out what's going wrong. Can anyone help me? Thanks!
-----------------------------------------parts of my code---------------------------
if [ $stage -le 2 ]; then
  echo =====================================================================
  echo " FBank Feature Generation "
  echo =====================================================================
  fbankdir=fbank

  # Generate the fbank features; by default 40-dimensional fbanks on each frame
  steps/make_fbank.sh --cmd "$train_cmd" --nj 32 data/data/trn exp/make_fbank/trn $fbankdir || exit 1;
  utils/fix_data_dir.sh data/data/trn || exit;
  steps/compute_cmvn_stats.sh data/data/trn exp/make_fbank/train $fbankdir || exit 1;

  steps/make_fbank.sh --cmd "$train_cmd" --nj 32 data/data/dev exp/make_fbank/dev $fbankdir || exit 1;
  utils/fix_data_dir.sh data/data/dev || exit;
  steps/compute_cmvn_stats.sh data/data/dev exp/make_fbank/train $fbankdir || exit 1;

  steps/make_fbank.sh --cmd "$train_cmd" --nj 32 data/data/tst exp/make_fbank/tst $fbankdir || exit 1;
  utils/fix_data_dir.sh data/data/tst || exit;
  steps/compute_cmvn_stats.sh data/data/tst exp/make_fbank/train $fbankdir || exit 1;
fi
if [ $stage -le 3 ]; then
  echo =====================================================================
  echo " Network Training "
  echo =====================================================================

  # Specify network structure and generate the network topology
  input_feat_dim=120 # dimension of the input features; we will use 40-dimensional fbanks with deltas and double deltas
  lstm_layer_num=3 # number of LSTM layers
  lstm_cell_dim=240 # number of memory cells in every LSTM layer

  dir=exp/train_phn_l${lstm_layer_num}_c${lstm_cell_dim}
  mkdir -p $dir

  target_num=`cat data/lang_phn/units.txt | wc -l`; target_num=$[$target_num+1]; # #targets = #labels + 1 (the blank)
  # Output the network topology
  utils/model_topo.py --input-feat-dim $input_feat_dim --lstm-layer-num $lstm_layer_num \
    --lstm-cell-dim $lstm_cell_dim --target-num $target_num \
    --fgate-bias-init 1.0 > $dir/nnet.proto || exit 1;
  # Label sequences; simply convert words into their label indices
utils/prep_ctc_trans.py data/lang_phn/lexicon_numbers.txt data/data/trn/trans "
  # Train the network with CTC. Refer to the script for details about the arguments
  steps/train_ctc_parallel.sh --add-deltas true --num-sequence 10 --frame-num-limit 25000 \
    --learn-rate 0.00004 --report-step 1000 --halving-after-epoch 12 \
    data/data/trn data/data/dev $dir || exit 1;
  echo =====================================================================
  echo " Decoding "
  echo =====================================================================

  # decoding
  for lm_suffix in sw1_tg sw1_fsh_tgpr; do
    steps/decode_ctc_lat.sh --cmd "$decode_cmd" --nj 20 --beam 17.0 --lattice_beam 8.0 --max-active 5000 --acwt 0.6 \
      data/lang_phn_${lm_suffix} data/data/tst $dir/decode_eval2000_${lm_suffix} || exit 1;
  done
fi
-------------------------------result-----------------------------
steps/train_ctc_parallel.sh --add-deltas true --num-sequence 10 --frame-num-limit 25000 --learn-rate 0.00004 --report-step 1000 --halving-after-epoch 12 data/data/trn data/data/dev exp/train_phn_l3_c240
feat-to-len scp:data/data/trn/feats.scp ark,t:-
feat-to-len scp:data/data/dev/feats.scp ark,t:-
copy-feats 'ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/data/trn/utt2spk scp:data/data/trn/cmvn.scp scp:exp/train_phn_l3_c240/train.scp ark:- |' ark,scp:/tmp/tmp.n5hv6njls3/train.ark,exp/train_phn_l3_c240/train_local.scp
apply-cmvn --norm-vars=true --utt2spk=ark:data/data/trn/utt2spk scp:data/data/trn/cmvn.scp scp:exp/train_phn_l3_c240/train.scp ark:-
LOG (apply-cmvn:main():apply-cmvn.cc:129) Applied cepstral mean and variance normalization to 3696 utterances, errors on 0
LOG (copy-feats:main():copy-feats.cc:100) Copied 3696 feature matrices.
copy-feats 'ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/data/dev/utt2spk scp:data/data/dev/cmvn.scp scp:exp/train_phn_l3_c240/cv.scp ark:- |' ark,scp:/tmp/tmp.n5hv6njls3/cv.ark,exp/train_phn_l3_c240/cv_local.scp
apply-cmvn --norm-vars=true --utt2spk=ark:data/data/dev/utt2spk scp:data/data/dev/cmvn.scp scp:exp/train_phn_l3_c240/cv.scp ark:-
LOG (apply-cmvn:main():apply-cmvn.cc:129) Applied cepstral mean and variance normalization to 400 utterances, errors on 0
LOG (copy-feats:main():copy-feats.cc:100) Copied 400 feature matrices.
Not sure - this all seems fine to me? Where is the error?
Yeah maybe it's just the misleading message "errors on 0" ... even just mentioning the word 'error' can be scary in a log message :)
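To make the reading of that message concrete, here is a small sketch that pulls the error count out of an apply-cmvn summary line like the ones quoted above. The function name `cmvn_error_count` is made up for illustration, and the pattern assumes the exact wording shown in this thread's log.

```shell
#!/usr/bin/env bash
# Sketch: read log text on stdin and print the number after "errors on"
# for each apply-cmvn summary line. A printed 0 means no utterance failed.
# cmvn_error_count is a hypothetical helper, not part of Kaldi or Eesen.
cmvn_error_count() {
  sed -n 's/.*Applied cepstral mean and variance normalization to [0-9][0-9]* utterances, errors on \([0-9][0-9]*\).*/\1/p'
}
```

Piping the training output through `cmvn_error_count` would print one number per CMVN summary line; for the log above both would be 0, i.e. the normalization step itself succeeded.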
But the training exited before finishing and I didn't get any model. I still can't figure out what's going wrong. This is the output:
Initializing model as exp/model_l4_c320/nnet/nnet.iter0
TRAINING STARTS [2017-Jun-15 11:38:32]
[NOTE] TOKEN_ACCURACY refers to token accuracy, i.e., (1.0 - token_error_rate).
EPOCH 25 RUNNING ...
Removing features tmpdir exp/model_l4_c320/Y7MuG @ 311Ubuntu
cv.ark train.ark
The log shows EPOCH 25 RUNNING, so presumably you already have models from epoch 0 (untrained) to 24, right? Or are you for some reason starting the training at epoch 25, in which case the system tries to load the epoch 24 model and fails?
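One way to check which epochs actually produced a model is to look for the highest-numbered model file. The sketch below assumes models are saved as nnet.iterN in the nnet/ directory, which is inferred only from the "Initializing model as exp/model_l4_c320/nnet/nnet.iter0" line above; the function name `latest_iter` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the highest N for which <nnet_dir>/nnet.iterN exists,
# or -1 if no such file is found.
# Assumption: models are named nnet.iterN, as suggested by the log line
# "Initializing model as exp/model_l4_c320/nnet/nnet.iter0".
latest_iter() {
  local nnet_dir=$1
  local best=-1 f n
  for f in "$nnet_dir"/nnet.iter*; do
    [ -e "$f" ] || continue            # glob matched nothing
    n=${f##*nnet.iter}                 # keep only the part after "nnet.iter"
    case $n in
      ''|*[!0-9]*) continue ;;         # skip suffixes that are not plain numbers
    esac
    if [ "$n" -gt "$best" ]; then
      best=$n
    fi
  done
  echo "$best"
}
```

Running `latest_iter exp/model_l4_c320/nnet` and getting something well below 24 would suggest the training was resumed at an epoch whose predecessor model does not exist.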
Take a look in exp/model_l4_c320/log/ and search for the training log of iteration 25. There might be some clues in the log why the training failed. You can also try to rerun the training from the last successful epoch by rerunning the training part of your script.
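A rough sketch of that log search follows. The file-name pattern `*iter25.*` is an assumption about how the per-iteration logs are named (check the actual contents of the log directory first), and `scan_iter_logs` is a hypothetical helper, not an Eesen script.

```shell
#!/usr/bin/env bash
# Sketch: print lines mentioning error/fail/abort from the logs of one iteration.
# Assumption: per-iteration log files contain "iter<N>." in their names;
# adjust the glob to whatever `ls` shows in your log directory.
scan_iter_logs() {
  local log_dir=$1 iter=$2 f
  local found=1                        # return 1 unless something matches
  for f in "$log_dir"/*iter"$iter".*; do
    [ -f "$f" ] || continue            # glob matched nothing
    # -H prints the file name, -n the line number, -i ignores case
    if grep -H -n -i -E 'error|fail|abort' "$f"; then
      found=0
    fi
  done
  return "$found"
}
```

For the case in this thread the call would be `scan_iter_logs exp/model_l4_c320/log 25`; an empty result means either the logs are clean or the file-name assumption does not hold.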