icefall
Add gradient filter for tdnn_lstm_ctc
This PR adds the gradient filter to the tdnn_lstm_ctc recipe. See https://github.com/k2-fsa/icefall/pull/564 for details.
@huangruizhe you can see whether this resolves your problem.
Hi, I've tested the tdnn_lstm_ctc2 recipe with grad_norm_threshold=100, but I think the model behaves much as it did before the gradient filter was added: it diverges when the learning rate is 1e-3, as in the default recipe, and starts to converge when lr=1e-4.
Here is the tensorboard:
- Running this recipe directly (with grad_norm_threshold=100): tdnn_lstm_ctc2/train.py (tensorboard)
- Running the above configuration, and shuffling the whole librispeech train cuts (tensorboard)
- The recipe before adding the gradient filter, and shuffling the whole librispeech train cuts: tdnn_lstm_ctc/train.py (tensorboard)
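
For readers following along, the general idea of a gradient filter with a norm threshold can be sketched roughly as below. This is only an illustration, not the implementation from the PR: the function name `filter_gradients`, the plain-list data layout, and the rule itself (zeroing the gradient of any sample whose L2 norm exceeds a fixed threshold) are all assumptions made for the example.

```python
import math
from typing import List


def filter_gradients(
    per_sample_grads: List[List[float]], grad_norm_threshold: float
) -> List[List[float]]:
    """Hypothetical filter: zero out any per-sample gradient whose
    L2 norm exceeds grad_norm_threshold; keep the others unchanged."""
    filtered = []
    for grad in per_sample_grads:
        # L2 norm of this sample's gradient vector.
        norm = math.sqrt(sum(g * g for g in grad))
        if norm > grad_norm_threshold:
            # Treat the sample as an outlier and drop its contribution.
            filtered.append([0.0] * len(grad))
        else:
            filtered.append(list(grad))
    return filtered


# Two toy samples with gradient norms 5 and 500.
grads = [[3.0, 4.0], [300.0, 400.0]]
print(filter_gradients(grads, grad_norm_threshold=100.0))
# → [[3.0, 4.0], [0.0, 0.0]]
```

Under this assumed rule, samples whose gradient norm stays within the threshold pass through unchanged, so only outlier samples are affected.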
It will be hard to diagnose what's really going on here without looking at the diagnostics files (obtained by starting from intermediate epochs and adding the flag --print-diagnostics=True; this should take about 5 minutes).
This recipe does not support the --print-diagnostics=True flag.
Ruizhe can figure out how to add the code from other recipes, and make a PR.