Very high WER when trying to use the bpe_500 lexicon with the tdnn_lstm_ctc model
Yes, I know tdnn-lstm is kind of outdated, but it's easier to fit onto our edge device, so I thought it might be worth trying. When I trained it, the loss soon stopped decreasing, and decode.py produced a WER close to 100, with the hyps almost always empty. My first guess was that 500 labels are too many for tdnn-lstm, so I generated a BPE model with 100 tokens. This time it is able to train, but the WER is still higher than with the phone lexicon. I'm not sure if there's something wrong with my code, or whether bpe_500 just doesn't work for tdnn-lstm.
The code that I used to train and decode is in my repo: https://github.com/wangtiance/icefall/tree/master/egs/librispeech/ASR/tdnn_lstm_ctc
Attachments: log-train-2022-07-25-17-04-51-0.txt, recogs-test-clean-no_rescore.txt
I think there are some problems in your training code. Your hyps are empty. You can print the IDs of the output and look at what they are. I suspect that your text labels (or text IDs) are wrong during training. You can print the y after y = sp.encode(texts, out_type=int) in the function compute_loss of train.py.
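If it helps, here is a minimal sketch of that check outside of train.py, assuming the BPE model lives at data/lang_bpe_500/bpe.model (adjust the path to your setup) and texts holds the transcripts of one batch:

```python
import sentencepiece as spm

# Load the same BPE model that train.py uses (path is an assumption).
sp = spm.SentencePieceProcessor()
sp.load("data/lang_bpe_500/bpe.model")

# Example transcripts standing in for the `texts` of a batch.
texts = ["HELLO WORLD", "ANOTHER UTTERANCE"]

# Same call as in compute_loss: encode each transcript to token IDs.
y = sp.encode(texts, out_type=int)

for text, ids in zip(texts, y):
    # Each entry should be a non-empty list of IDs within the vocabulary,
    # and decoding the IDs should reproduce the original transcript.
    print(ids, "->", sp.decode(ids))
    assert len(ids) > 0, f"empty ID list for: {text}"
    assert all(0 < i < sp.get_piece_size() for i in ids), f"ID out of range for: {text}"
```

If the printed IDs are empty, all mapped to <unk>, or do not decode back to the original text, the problem is in how the lexicon/BPE model is wired into training rather than in the model itself.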
Thanks for the suggestion. Will look into that.