snowfall [WIP] Add iterated loss

Open zhu-han opened this issue 3 years ago • 0 comments

This PR implements iterated loss from https://github.com/k2-fsa/snowfall/issues/179#issuecomment-830565127. Reference: https://arxiv.org/pdf/1910.10324.pdf

The results below can be reproduced with:

python mmi_att_transformer_train.py --world-size 2 --full-libri 0 --use-ali-model 0 --max-duration 250 --iterated-layers 5 --iterated-scale 0.3

Table 1 shows WER results for different iterated scales. So far, the iterated loss shows no clear improvement over the baseline.

Table 1

iterated scale	test-clean	test-other	test-clean (rescore)	test-other (rescore)
-	6.74	17.18	5.63	14.92
-	6.78	17.49	5.76	15.31
0.01	6.71	17.34	5.6	14.86
0.05	6.58	17.35	5.69	15.06
0.30	6.57	17.6	5.61	15.38
1.00	6.77	17.69	5.8	15.58
10.00	6.93	18.31	5.88	16.22

The first two rows are baseline results without the iterated loss. I ran the baseline twice to gauge the run-to-run variance of the results.
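One way to read Table 1 against that variance is to treat the two baseline runs as a noise band and check, per column, whether each iterated-scale run falls below, inside, or above it. The snippet below is only a rough sketch of that check: the WER values are copied from Table 1, while the helper names (`baseline_range`, `classify`) are hypothetical, and two runs are too few for a real significance estimate.

```python
# WER values copied from Table 1.
# Column order: test-clean, test-other, test-clean (rescore), test-other (rescore).
baselines = [
    (6.74, 17.18, 5.63, 14.92),
    (6.78, 17.49, 5.76, 15.31),
]
runs = {
    0.01: (6.71, 17.34, 5.60, 14.86),
    0.05: (6.58, 17.35, 5.69, 15.06),
    0.30: (6.57, 17.60, 5.61, 15.38),
    1.00: (6.77, 17.69, 5.80, 15.58),
    10.00: (6.93, 18.31, 5.88, 16.22),
}


def baseline_range(rows):
    """Return per-column (min, max) over the baseline runs."""
    lo = tuple(min(col) for col in zip(*rows))
    hi = tuple(max(col) for col in zip(*rows))
    return lo, hi


def classify(row, lo, hi):
    """Label each WER as better (lower), worse (higher), or within the baseline band."""
    labels = []
    for value, low, high in zip(row, lo, hi):
        if value < low:
            labels.append("better")
        elif value > high:
            labels.append("worse")
        else:
            labels.append("within")
    return labels


lo, hi = baseline_range(baselines)
for scale, row in runs.items():
    print(f"{scale:>5}: {classify(row, lo, hi)}")
```

With these numbers, small scales help on some columns but not consistently, and a scale of 10.00 is worse than both baseline runs in every column.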

Details:

It adds an extra MMI loss after the 6th conformer layer. I also tried adding extra MMI losses after both the 4th and 8th layers; the results are similar and are shown in Table 2.
The weights of the bigram LM used in the MMI loss are not updated by the extra MMI loss. Table 3 compares this with the alternative, where the extra loss also updates the bigram.
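The idea can be sketched in a few lines of plain Python. This is an illustration, not the code in this PR. It makes several assumptions: the total loss is taken to be the final loss plus `iterated_scale` times the sum of the intermediate losses, layer indices are 0-based (so index 5 is the 6th layer, matching `--iterated-layers 5`), and a toy 12-layer stack of scalar functions stands in for the conformer. The names `forward_with_taps` and `iterated_loss` are hypothetical. Keeping the bigram LM out of the extra loss's gradient is not shown here.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Layer = Callable[[float], float]


def forward_with_taps(
    layers: Sequence[Layer], x: float, iterated_layers: Iterable[int]
) -> Tuple[float, List[float]]:
    """Run x through all layers and also collect the outputs of the tapped layers.

    iterated_layers holds 0-based layer indices (an assumption of this sketch).
    """
    tapped = set(iterated_layers)
    taps = []
    for idx, layer in enumerate(layers):
        x = layer(x)
        if idx in tapped:
            taps.append(x)
    return x, taps


def iterated_loss(
    final_loss: float, intermediate_losses: Sequence[float], iterated_scale: float
) -> float:
    """Combine the final loss with the scaled sum of the intermediate losses."""
    return final_loss + iterated_scale * sum(intermediate_losses)


# Toy stand-ins: 12 "layers" that each add 1.0, and a squared-error "loss".
layers = [lambda v: v + 1.0] * 12
target = 12.0


def toy_loss(output: float) -> float:
    return (output - target) ** 2


final_out, taps = forward_with_taps(layers, 0.0, iterated_layers=[5])
total = iterated_loss(toy_loss(final_out), [toy_loss(t) for t in taps], iterated_scale=0.3)
print(final_out, taps, total)
```

Setting `iterated_scale` to 0 recovers the baseline objective, and passing `[3, 7]` as `iterated_layers` corresponds to the 4th-and-8th-layer variant in Table 2 under the same 0-based assumption.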

Extra results:

Table 2 (Add extra mmi losses after 4th and 8th layers)

iterated scale	test-clean	test-other	test-clean (rescore)	test-other (rescore)
-	6.74	17.18	5.63	14.92
-	6.78	17.49	5.76	15.31
0.30	6.61	17.65	5.65	15.52
1.00	6.66	18.13	5.68	15.87
10.00	6.75	18.43	5.74	16.25

Table 3 (Update bigram using extra mmi loss or not)

model	test-clean	test-other	test-clean (rescore)	test-other (rescore)
-	6.57	17.6	5.61	15.38
+ update bigram with extra loss	6.75	17.77	5.69	15.53

May 12 '21 09:05 zhu-han
