Add gradient filter for tdnn_lstm_ctc

Open yaozengwei opened this issue 3 years ago • 5 comments

This PR adds the gradient filter to the tdnn_lstm_ctc recipe. See https://github.com/k2-fsa/icefall/pull/564 for details.
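For readers unfamiliar with the idea, here is a minimal pure-Python sketch of what a gradient filter might do. It is not the code from the PR: the function name `filter_gradients`, the use of the batch median as the reference, and zeroing (rather than rescaling) the filtered gradients are all assumptions made for illustration.

```python
from statistics import median
from typing import List


def filter_gradients(
    per_sample_grads: List[List[float]], grad_norm_threshold: float
) -> List[List[float]]:
    """Zero out the gradients of samples whose norm is an outlier.

    Hypothetical helper: a sample counts as an outlier when its L2 gradient
    norm exceeds grad_norm_threshold times the median norm of the batch.
    """
    # L2 norm of each sample's gradient vector.
    norms = [sum(g * g for g in grad) ** 0.5 for grad in per_sample_grads]
    limit = grad_norm_threshold * median(norms)
    # Keep well-behaved samples unchanged; replace outliers with zeros.
    return [
        list(grad) if norm <= limit else [0.0] * len(grad)
        for grad, norm in zip(per_sample_grads, norms)
    ]


# Three samples with gradient norms of roughly 5, 0.5 and 500.
grads = [[3.0, 4.0], [0.3, 0.4], [300.0, 400.0]]
filtered = filter_gradients(grads, grad_norm_threshold=10.0)
```

With these made-up numbers the median norm is 5, so only the third sample (norm 500) exceeds the limit of 50 and is dropped.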

yaozengwei avatar Sep 05 '22 14:09 yaozengwei

@huangruizhe you can see whether this resolves your problem.

danpovey avatar Sep 06 '22 01:09 danpovey

Hi, I've tested the tdnn_lstm_ctc2 recipe with grad_norm_threshold=100, but I think the model behaves much as it did before the gradient filter was added: it diverges at the recipe's default learning rate of 1e-3 and only starts to converge at lr=1e-4.

Here are the TensorBoard logs:

  1. Running this recipe directly (with grad_norm_threshold=100): tdnn_lstm_ctc2/train.py tensorboard

  2. Running the above configuration and shuffling the whole LibriSpeech train cuts: tensorboard

  3. The recipe before adding the gradient filter, with the whole LibriSpeech train cuts shuffled: tdnn_lstm_ctc/train.py tensorboard

huangruizhe avatar Sep 09 '22 01:09 huangruizhe

It will be hard to diagnose what's really going on here without looking at the diagnostics files (obtained by starting from intermediate epochs and adding the flag --print-diagnostics=True; this should take about 5 minutes).

danpovey avatar Sep 09 '22 02:09 danpovey

This recipe does not support the --print-diagnostics=True flag.

yaozengwei avatar Sep 09 '22 02:09 yaozengwei

Ruizhe can figure out how to add the code from other recipes, and make a PR.
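As a rough starting point, the command-line side of such a change could look like the sketch below. It only shows how a boolean --print-diagnostics option might be parsed; `str2bool` is defined locally as a stand-in, `get_parser` is a hypothetical name, and the code that actually attaches and prints the diagnostics would still need to be copied from a recipe that already supports the flag.

```python
import argparse


def str2bool(value: str) -> bool:
    """Local stand-in that turns strings such as "True" or "false" into bools."""
    lowered = value.lower()
    if lowered in ("true", "yes", "y", "1"):
        return True
    if lowered in ("false", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean value, got {value!r}")


def get_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: a real train.py defines many more options.
    parser = argparse.ArgumentParser(description="Sketch of a training script")
    parser.add_argument(
        "--print-diagnostics",
        type=str2bool,
        default=False,
        help="If true, collect model diagnostics for a few batches and exit.",
    )
    return parser


args = get_parser().parse_args(["--print-diagnostics=True"])
if args.print_diagnostics:
    # Placeholder: attach the diagnostics hooks to the model here, run a few
    # batches, print the collected statistics, and return early.
    pass
```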

danpovey avatar Sep 09 '22 02:09 danpovey