Yajie Miao comments

Results: 16 comments by Yajie Miao

DNN Model and Configuration

You can open nnet.mdl directly. Depending on the model size, loading the file may take some time. nnet.cfg is a pickle-formatted dump of the network configuration (a class). You can load it...
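A minimal sketch of reading a pickled configuration object, assuming the file was written with Python's standard pickle module. The stand-in object and its attribute names (`hidden_layers`, `n_outs`) are hypothetical and used only so the example runs on its own. For a real nnet.cfg, the toolkit's configuration class must be importable before unpickling, and a file written under Python 2 may need `encoding="latin1"` passed to `pickle.load`.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def load_config(path):
    """Unpickle a configuration object from `path` and return it."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(cfg):
    """Return the object's attributes as a plain dict for inspection."""
    return dict(vars(cfg))


# Hypothetical stand-in for the real configuration object (assumption:
# the real file holds an instance of a class defined by the toolkit).
demo_cfg = SimpleNamespace(hidden_layers=[1024, 1024], n_outs=1943)

# Write the stand-in to a throwaway file named like the real one.
demo_path = Path(tempfile.mkdtemp()) / "nnet.cfg"
with open(demo_path, "wb") as f:
    pickle.dump(demo_cfg, f)

# Load it back and list its fields.
loaded = load_config(demo_path)
for name, value in describe(loaded).items():
    print(name, "=", value)
```

Only unpickle files you trust, since unpickling can execute arbitrary code.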

hidden layers outputs

You can do it with cmds/run_Extract_Feats.py.

Timit fbank result is ok? and how to add some features such as delta-delta?

A word-based language model built on TIMIT is relatively weak. I recommend composing a phone language model instead. You plug in a fake dictionary which simply contains duplicates of the phones:...
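The fake dictionary described above could be generated along these lines. This is a sketch under an assumption: each entry is written as `phone phone`, so that the "word" and its pronunciation are the same symbol. The exact format is elided in the comment, and the phone list here is a made-up sample rather than the TIMIT phone set.

```python
def build_phone_lexicon(phones):
    """Return lexicon lines that map every phone to itself.

    Blank entries and repeated phones are skipped so that each phone
    appears exactly once, in first-seen order.
    """
    seen = set()
    lines = []
    for phone in phones:
        phone = phone.strip()
        if not phone or phone in seen:
            continue
        seen.add(phone)
        # Assumed entry format: "<word> <pronunciation>", both the phone.
        lines.append(f"{phone} {phone}")
    return lines


# Hypothetical sample phone list (not the full TIMIT inventory).
phones = ["aa", "ae", "sil", "aa"]
lexicon = build_phone_lexicon(phones)
print("\n".join(lexicon))
```

With a dictionary like this, the decoder's "words" are phones, so the language model is estimated over phone sequences rather than word sequences.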

Timit fbank result is ok? and how to add some features such as delta-delta?

Yep. My very first verification of EESEN was done on TIMIT. I was able to get reasonable (if not state-of-the-art) phone error rates.

Timit fbank result is ok? and how to add some features such as delta-delta?

In general, CTC depends heavily on BiLSTMs for reasonable performance. If you refer to http://www.cs.cmu.edu/~ymiao/pub/icassp2016_ctc.pdf, on Switchboard, unidirectional models perform >15% worse than bidirectional models, with the same number of...

Token Accuracy Drops Obj(log[Pzx])=nan

If it still fails, try reducing --frame-num-limit, as it affects the deltas of the parameters.

Is it possible to use warp-ctc instead?

We are studying the validity of the comparisons they presented, specifically the linear correlation between time and N for Eesen.

tedlium char based training scripts

The previous tedlium char recipe did not work. I just made changes, and it should work now. I am validating it on my side.

I runned eesen on gridengine cluster only feature extraction and decoding runned on the cluster

You can submit the run of train_ctc_parallel.sh (for example https://github.com/srvk/eesen/blob/master/asr_egs/wsj/run_ctc_phn.sh#L75) to the scheduler. Alternatively, you can modify train_ctc_parallel.sh by following https://github.com/srvk/eesen/blob/master/asr_egs/wsj/steps/train_ctc_parallel_h.sh#L141

I runned eesen on gridengine cluster only feature extraction and decoding runned on the cluster

You should set it to the number of jobs (in your case just 1), instead of the number of GPUs. When you set it to 3, the script will submit...