
Hybrid autoregressive transducer

desh2608 opened this issue 2 years ago · 5 comments

I was wondering if there are any existing recipes for the HAT model. It is a straightforward change (the blank distribution is modeled as a Bernoulli distribution), and it has been shown to be useful for integrating external LMs, among other things.

Has anyone tried it in icefall, especially with the pruned loss?
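For concreteness, the change I mean is roughly the following (a minimal PyTorch sketch of the HAT output factorization, assuming `blank_id = 0` and a standard joiner that produces vocab-sized logits; not an existing icefall recipe):

```python
import torch
import torch.nn.functional as F


def hat_log_probs(logits: torch.Tensor, blank_id: int = 0) -> torch.Tensor:
    """Convert joiner logits into HAT-style log-probabilities.

    The blank is modeled with a Bernoulli distribution (sigmoid on the blank
    logit); the non-blank tokens share the remaining mass (1 - p_blank)
    through a softmax over the non-blank logits only.

    logits: (..., vocab_size) raw joiner outputs.
    Returns log-probabilities of the same shape, in the original token order.
    """
    blank_logit = logits[..., blank_id : blank_id + 1]   # (..., 1)
    log_p_blank = F.logsigmoid(blank_logit)              # log sigma(z_blank)
    log_p_not_blank = F.logsigmoid(-blank_logit)         # log(1 - sigma(z_blank))

    # Softmax restricted to the non-blank tokens.
    non_blank = torch.cat(
        [logits[..., :blank_id], logits[..., blank_id + 1 :]], dim=-1
    )
    log_label = log_p_not_blank + F.log_softmax(non_blank, dim=-1)

    # Re-assemble in the original vocabulary order.
    return torch.cat(
        [log_label[..., :blank_id], log_p_blank, log_label[..., blank_id:]],
        dim=-1,
    )
```

The transducer loss would then be computed on these log-probs instead of a plain log-softmax over the full vocabulary.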

desh2608 avatar Sep 24 '23 20:09 desh2608

We have not tried that. It would be great if you could add it.

csukuangfj avatar Sep 24 '23 23:09 csukuangfj

@csukuangfj Do you have advice on what would be a good evaluation setup for using HAT to integrate external LMs? For example, how did you evaluate the LODR methods?

desh2608 avatar Sep 27 '23 13:09 desh2608

For a POC, I was just training a model on LibriSpeech, and was planning to use an external RNNLM. But Dan pointed out that LibriSpeech may not be the best test-bed for these experiments.
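The kind of fusion I had in mind is roughly this (a sketch only; the scales are placeholders, and the internal-LM estimate would come from running the joiner with the encoder contribution zeroed out, as in the HAT paper):

```python
import torch


def hat_fusion_score(
    log_p_label: torch.Tensor,   # HAT label log-probs for the next token, (V,)
    log_p_ext_lm: torch.Tensor,  # external RNNLM log-probs, (V,)
    log_p_ilm: torch.Tensor,     # internal-LM estimate (joiner with zeroed encoder output), (V,)
    lm_scale: float = 0.3,       # placeholder weights; tune on a dev set
    ilm_scale: float = 0.1,
) -> torch.Tensor:
    """Shallow fusion with internal-LM subtraction, which HAT makes
    well-defined because the label posterior is factored out from blank."""
    return log_p_label + lm_scale * log_p_ext_lm - ilm_scale * log_p_ilm
```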

desh2608 avatar Sep 27 '23 13:09 desh2608

@marcoyang1998

Could you have a look?

csukuangfj avatar Sep 27 '23 14:09 csukuangfj

You may try cross-domain evaluation scenarios, e.g. decoding the LibriSpeech model on the Gigaspeech test sets using an RNNLM trained on the Gigaspeech transcripts. I believe I tested LODR in this scenario and it yielded better results than using only shallow fusion.
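Roughly, LODR combines the scores like this during beam search (illustrative weights only; they need tuning on a dev set):

```python
import torch


def lodr_score(
    log_p_hyp: torch.Tensor,            # transducer/HAT score of the partial hypothesis
    log_p_target_lm: torch.Tensor,      # e.g. RNNLM trained on GigaSpeech transcripts
    log_p_source_bigram: torch.Tensor,  # low-order n-gram trained on the source-domain text
    lm_scale: float = 0.4,              # illustrative weights
    lodr_scale: float = 0.2,
) -> torch.Tensor:
    """LODR-style rescoring: add the target-domain LM and subtract a
    low-order source-domain LM as a cheap proxy for the internal LM."""
    return log_p_hyp + lm_scale * log_p_target_lm - lodr_scale * log_p_source_bigram
```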

marcoyang1998 avatar Sep 27 '23 14:09 marcoyang1998