
support combining two n-gram LMs

Open csukuangfj opened this issue 2 years ago • 2 comments

csukuangfj avatar Apr 10 '23 04:04 csukuangfj

What's the use case for this? Hot word detection?

wangtiance avatar Apr 11 '23 02:04 wangtiance

> What's the use case for this? Hot word detection?

Please have a look at the paper "An Empirical Study of Language Model Integration for Transducer based Speech Recognition".

That paper uses a neural LM to estimate the internal language model (ILM), whereas we use an n-gram LM for the ILM.

Please also have a look at https://github.com/k2-fsa/icefall/blob/6434c8eadc0d4326e1db69824cf0e40dc9a71c8a/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L1862

With this pull request, we will replace https://github.com/k2-fsa/icefall/blob/6434c8eadc0d4326e1db69824cf0e40dc9a71c8a/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L1868 with an n-gram LM, and then merge https://github.com/k2-fsa/icefall/blob/6434c8eadc0d4326e1db69824cf0e40dc9a71c8a/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L1866 and https://github.com/k2-fsa/icefall/blob/6434c8eadc0d4326e1db69824cf0e40dc9a71c8a/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L1868 into a single n-gram LM.
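
To illustrate the idea of merging two n-gram LMs into one, here is a minimal sketch. It is not the icefall implementation: the toy bigram tables, the function names `combine_ngram_lms` and `fused_score`, and the scale values are all hypothetical, and backoff is ignored. It only shows that adding a scaled external-LM score and a (negatively) scaled ILM score at decoding time gives the same result as looking up one precombined table.

```python
import math

# Hypothetical toy bigram tables: (history, token) -> log-probability.
# A real n-gram LM also has backoff weights, which this sketch ignores.
external_lm = {("<s>", "a"): math.log(0.6), ("<s>", "b"): math.log(0.4)}
internal_lm = {("<s>", "a"): math.log(0.5), ("<s>", "b"): math.log(0.5)}


def combine_ngram_lms(ext, ilm, ext_scale, ilm_scale):
    """Precompute ext_scale * ext + ilm_scale * ilm for every shared n-gram.

    ilm_scale is expected to be negative, so that the ILM score is subtracted.
    The result is a table of combined scores, not a normalized distribution.
    """
    return {
        key: ext_scale * ext[key] + ilm_scale * ilm[key]
        for key in ext.keys() & ilm.keys()
    }


def fused_score(am_log_prob, merged, history, token):
    """Add the combined LM score to the acoustic/transducer log-probability."""
    return am_log_prob + merged[(history, token)]


# Assumed scales, for illustration only.
merged_lm = combine_ngram_lms(external_lm, internal_lm, ext_scale=0.4, ilm_scale=-0.2)
score = fused_score(math.log(0.7), merged_lm, "<s>", "a")
```

With a single merged table, the decoder needs one LM lookup per token instead of two.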

csukuangfj avatar Apr 11 '23 02:04 csukuangfj