[WIP] Add iterated loss

Open zhu-han opened this issue 3 years ago • 0 comments

This PR implements iterated loss from https://github.com/k2-fsa/snowfall/issues/179#issuecomment-830565127. Reference: https://arxiv.org/pdf/1910.10324.pdf

The results below can be reproduced with:

python mmi_att_transformer_train.py --world-size 2 --full-libri 0 --use-ali-model 0 --max-duration 250 --iterated-layers 5 --iterated-scale 0.3

Results for different iterated scales are shown in Table 1. So far they show no clear improvement over the baseline.

  • Table 1

| iterated scale | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.74 | 17.18 | 5.63 | 14.92 |
| - | 6.78 | 17.49 | 5.76 | 15.31 |
| 0.01 | 6.71 | 17.34 | 5.6 | 14.86 |
| 0.05 | 6.58 | 17.35 | 5.69 | 15.06 |
| 0.30 | 6.57 | 17.6 | 5.61 | 15.38 |
| 1.00 | 6.77 | 17.69 | 5.8 | 15.58 |
| 10.00 | 6.93 | 18.31 | 5.88 | 16.22 |

The first two rows are baseline results without iterated loss. I ran the baseline twice to see how much the results vary between runs.
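To make the comparison against the two baseline runs concrete, here is a small sketch that checks, for each iterated scale in Table 1, whether a number falls below, above, or inside the range spanned by the two baselines. It assumes the numbers are error rates where lower is better, and it treats the spread of two runs as a rough proxy for run-to-run noise. The names `BASELINES`, `ITERATED`, and `classify` are made up for illustration and are not part of the PR.

```python
# Numbers copied from Table 1. Column order:
# test-clean, test-other, test-clean (rescore), test-other (rescore).
BASELINES = [
    (6.74, 17.18, 5.63, 14.92),
    (6.78, 17.49, 5.76, 15.31),
]

ITERATED = {
    0.01: (6.71, 17.34, 5.6, 14.86),
    0.05: (6.58, 17.35, 5.69, 15.06),
    0.30: (6.57, 17.6, 5.61, 15.38),
    1.00: (6.77, 17.69, 5.8, 15.58),
    10.00: (6.93, 18.31, 5.88, 16.22),
}


def classify(scale):
    """Label each column relative to the range of the two baseline runs.

    Assumption: lower is better, so a value below both baselines is
    "better", above both is "worse", and anything else is "within".
    """
    labels = []
    for col, value in enumerate(ITERATED[scale]):
        lo = min(run[col] for run in BASELINES)
        hi = max(run[col] for run in BASELINES)
        if value < lo:
            labels.append("better")
        elif value > hi:
            labels.append("worse")
        else:
            labels.append("within")
    return labels


if __name__ == "__main__":
    for scale in ITERATED:
        print(scale, classify(scale))
```

Under this crude criterion, no scale is better than both baseline runs on all four columns, which is consistent with the observation that the gains are not clear yet.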

Details:

  • It adds an extra MMI loss after the 6th conformer layer. I also tried adding extra losses after both the 4th and 8th layers; the results are similar and are shown in Table 2.
  • The weight of the bigram LM in the MMI loss is not updated by the extra MMI loss. Table 3 compares this with the alternative, where the extra loss also updates the bigram.
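As a rough illustration of how the extra losses might enter the training objective, the sketch below assumes the intermediate-layer losses are summed, multiplied by the iterated scale, and added to the final-layer loss. This weighting is an assumption based on the `--iterated-scale` flag, not a copy of the PR's code, and `combine_iterated_loss` is a hypothetical helper name.

```python
from typing import Sequence


def combine_iterated_loss(
    final_loss: float,
    intermediate_losses: Sequence[float],
    iterated_scale: float,
) -> float:
    """Return final_loss + iterated_scale * sum(intermediate_losses).

    Assumption: every intermediate loss shares the same scale. With no
    intermediate losses this reduces to the baseline objective.
    """
    return final_loss + iterated_scale * sum(intermediate_losses)


if __name__ == "__main__":
    # One extra loss (e.g. after a single intermediate layer).
    print(combine_iterated_loss(2.0, [1.0], 0.5))
    # Two extra losses (e.g. after two intermediate layers).
    print(combine_iterated_loss(2.0, [1.0, 3.0], 0.5))
```

The helper only uses `+`, `*`, and `sum`, so the same expression should also work on scalar loss tensors. Keeping the bigram LM out of the extra loss's gradient would be handled separately, where the intermediate loss is computed.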

Extra results:

  • Table 2 (Add extra MMI losses after 4th and 8th layers)

| iterated scale | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.74 | 17.18 | 5.63 | 14.92 |
| - | 6.78 | 17.49 | 5.76 | 15.31 |
| 0.30 | 6.61 | 17.65 | 5.65 | 15.52 |
| 1.00 | 6.66 | 18.13 | 5.68 | 15.87 |
| 10.00 | 6.75 | 18.43 | 5.74 | 16.25 |

  • Table 3 (Update bigram using extra MMI loss or not)

| model | test-clean | test-other | test-clean (rescore) | test-other (rescore) |
|---|---|---|---|---|
| - | 6.57 | 17.6 | 5.61 | 15.38 |
| + update bigram with extra loss | 6.75 | 17.77 | 5.69 | 15.53 |

zhu-han, May 12 '21 09:05