icefall
[Not for merge] Add Repeat-K mechanism to Zipformer
This keeps a separate vector in the model as an nn.Parameter and adds it, scaled by k/(1+k), to the embedding before the ReLU, during both training and decoding.
Only the greedy_search and greedy_search_batch decoding methods are supported.
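A minimal sketch of the mechanism described above, not the code from this PR. The class name `RepeatKEmbedding`, the argument names, the zero initialization of the vector, and the treatment of k as a non-negative per-position count passed in by the caller are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class RepeatKEmbedding(nn.Module):
    """Hypothetical sketch: ReLU(embedding + k / (1 + k) * learned vector)."""

    def __init__(self, vocab_size: int, embed_dim: int) -> None:
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The separate vector kept as an nn.Parameter.
        # Zero initialization is an assumption, not taken from the PR.
        self.repeat_vector = nn.Parameter(torch.zeros(embed_dim))

    def forward(self, tokens: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
        # tokens: (N, U) token ids; k: (N, U) non-negative counts (assumed meaning).
        embed = self.embedding(tokens)  # (N, U, D)
        k = k.to(embed.dtype)
        # The scale is 0 when k == 0 and approaches 1 as k grows.
        scale = (k / (1.0 + k)).unsqueeze(-1)  # (N, U, 1)
        # Add the scaled vector before the ReLU, as the description states.
        return torch.relu(embed + scale * self.repeat_vector)
```

With k = 0 the output reduces to the plain ReLU of the embedding, so under these assumptions the extra vector only has an effect at positions where k > 0.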
Results on LibriSpeech 100h
pruned_transducer_stateless7 -> pruned_transducer_stateless9
| test-clean WER (%) | test-other WER (%) | config |
|---|---|---|
| 6.48 -> 6.64 | 17.03 -> 17.50 | epoch 30 avg 1 |
| 6.43 -> 6.49 | 16.65 -> 16.85 | epoch 30 avg 2 |
| 6.31 -> 6.32 | 16.57 -> 16.53 | epoch 30 avg 3 |
| 6.31 -> 6.28 | 16.39 -> 16.23 | epoch 30 avg 4 |
| 6.26 -> 6.23 | 16.22 -> 16.01 | epoch 30 avg 5 |
| 6.22 -> 6.18 | 16.13 -> 15.89 | epoch 30 avg 6 |
| 6.24 -> 6.15 | 16.03 -> 15.80 | epoch 30 avg 7 |
| 6.17 -> 6.13 | 15.96 -> 15.78 | epoch 30 avg 8 |
| 6.11 -> 6.07 | 15.91 -> 15.79 | epoch 30 avg 9 |
| 6.10 -> 6.03 | 15.88 -> 15.79 | epoch 30 avg 10 |
| 6.10 -> 6.03 | 15.81 -> 15.73 | epoch 30 avg 11 |
| 6.06 -> 6.02 | 15.77 -> 15.78 | epoch 30 avg 12 |
| 6.08 -> 6.04 | 15.77 -> 15.81 | epoch 30 avg 13 |
| 6.05 -> 6.05 | 15.79 -> 15.83 | epoch 30 avg 14 |
| 6.02 -> 6.07 | 15.81 -> 15.80 | epoch 30 avg 15 |
pruned_transducer_stateless9
| test-clean & test-other WER (%) | sum | config |
|---|---|---|
| 6.03 & 15.73 | 21.76 | epoch 30 avg 11 |
| 6.02 & 15.78 | 21.80 | epoch 30 avg 12 |
| 6.03 & 15.79 | 21.82 | epoch 30 avg 10 |
| 6.04 & 15.81 | 21.85 | epoch 30 avg 13 |
| 6.07 & 15.79 | 21.86 | epoch 30 avg 9 |
| 6.07 & 15.80 | 21.87 | epoch 30 avg 15 |
| 6.05 & 15.83 | 21.88 | epoch 30 avg 14 |
| 6.06 & 15.82 | 21.88 | epoch 30 avg 16 |
| 6.13 & 15.78 | 21.91 | epoch 30 avg 8 |
| 6.05 & 15.86 | 21.91 | epoch 30 avg 17 |
| 6.05 & 15.87 | 21.92 | epoch 30 avg 18 |
| 6.15 & 15.80 | 21.95 | epoch 30 avg 7 |
| 6.10 & 15.94 | 22.04 | epoch 30 avg 19 |
| 6.18 & 15.89 | 22.07 | epoch 30 avg 6 |
| 6.17 & 16.01 | 22.18 | epoch 30 avg 20 |
| 6.23 & 16.01 | 22.24 | epoch 30 avg 5 |
| 6.28 & 16.23 | 22.51 | epoch 30 avg 4 |
| 6.32 & 16.53 | 22.85 | epoch 30 avg 3 |
| 6.49 & 16.85 | 23.34 | epoch 30 avg 2 |
| 6.64 & 17.50 | 24.14 | epoch 30 avg 1 |
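
The table above is ordered by the sum of the two test-set numbers, lowest first. A small sketch of that ranking, using three rows copied from the table; the helper name `rank_by_sum` is hypothetical.

```python
# (avg, test-clean, test-other) for three rows copied from the table above.
rows = [(10, 6.03, 15.79), (11, 6.03, 15.73), (12, 6.02, 15.78)]


def rank_by_sum(rows):
    """Return (avg, sum) pairs sorted by test-clean + test-other, lowest first."""
    ranked = []
    for avg, clean, other in rows:
        # Sum in integer hundredths to avoid floating-point rounding noise.
        total = (round(clean * 100) + round(other * 100)) / 100
        ranked.append((avg, total))
    return sorted(ranked, key=lambda item: (item[1], item[0]))


for avg, total in rank_by_sum(rows):
    print(f"epoch 30 avg {avg}: {total:.2f}")
```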