icefall
Hybrid autoregressive transducer
I was wondering if there are any existing recipes for the HAT model. It is a straightforward change to the transducer: the blank probability is modeled with its own Bernoulli distribution, separate from the softmax over the non-blank labels. It has been shown to be useful for integrating external LMs, among other things.
Has anyone tried it in icefall, especially with the pruned loss?
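
To make the change concrete, here is a minimal sketch of the HAT-style output factorization for a single joiner output. It is plain Python for illustration, not icefall code; the function name `hat_log_probs` and the convention that the blank sits at index 0 are assumptions.

```python
import math


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def hat_log_probs(logits, blank_id=0):
    """Log-probabilities over all symbols under a HAT-style factorization.

    logits: list of floats, one joiner output per symbol (hypothetical layout,
    with the blank logit at position `blank_id`).
    """
    blank_logit = logits[blank_id]
    # Blank is a Bernoulli variable: p(blank) = sigmoid(blank_logit).
    log_p_blank = -_softplus(-blank_logit)
    log_p_not_blank = -_softplus(blank_logit)

    # Labels get a softmax over the non-blank logits only.
    label_logits = [x for i, x in enumerate(logits) if i != blank_id]
    m = max(label_logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in label_logits))

    out = []
    for i, x in enumerate(logits):
        if i == blank_id:
            out.append(log_p_blank)
        else:
            # p(label) = (1 - p(blank)) * softmax(label logits)
            out.append(log_p_not_blank + (x - log_z))
    return out
```

The result is still a normalized distribution over blank plus labels, so in principle it could be fed to a transducer loss in place of the usual log-softmax; whether that works well with the pruned loss is exactly the open question here.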
We have not tried that. Would be great if you can add that.
@csukuangfj Do you have advice on what would be a good evaluation setup for using HAT to integrate external LMs? For example, how did you evaluate the LODR methods?
For a POC, I was just training a model on LibriSpeech, and was planning to use an external RNNLM. But Dan pointed out that LibriSpeech may not be the best test-bed for these experiments.
@marcoyang1998
Could you have a look?
You may try cross-domain evaluation scenarios, e.g. decoding the LibriSpeech model on the GigaSpeech test sets using an RNNLM trained on the GigaSpeech transcripts. I believe I tested LODR in this scenario and it yielded better results than using only shallow fusion.
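
For reference, the two scoring rules being compared can be sketched per hypothesis as below. This is only an illustration of the usual formulation, not the icefall decoding code: the function names, argument names, and the sign convention (a positive `lodr_scale` that is subtracted) are assumptions.

```python
def shallow_fusion_score(log_p_asr, log_p_ext_lm, lm_scale):
    # Shallow fusion: add the scaled external (target-domain) LM score.
    return log_p_asr + lm_scale * log_p_ext_lm


def lodr_score(log_p_asr, log_p_ext_lm, log_p_low_order_lm, lm_scale, lodr_scale):
    # LODR: additionally subtract a scaled low-order (e.g. bigram) LM score
    # estimated on the source domain, as a stand-in for the internal LM.
    return log_p_asr + lm_scale * log_p_ext_lm - lodr_scale * log_p_low_order_lm
```

A HAT model would fit into the same comparison by replacing the low-order LM term with its own internal LM estimate, so the cross-domain setup above should work for evaluating it as well.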