icefall
How to get the backoff-id from an LG graph?
I want to use the n-gram LM from https://github.com/k2-fsa/icefall/blob/b293db4baf1606cfe95066cf28ffde56173a7ddb/icefall/ngram_lm.py#L27
But I don't know how to get the backoff-id when building the LG graph.
L: Subword.
G: n-Gram Model.
Built with https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang_bpe.py and https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/compile_lg.py
I assume you are referring to the G graph, right?
The backoff ID is the ID of #0.
If your G is a token-level G, then #0 should come from tokens.txt.
If your G is a word-level G, then #0 is from words.txt.
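As a rough illustration of the lookup described above, here is a minimal sketch that reads the ID of #0 from a symbol table. It assumes the table is a plain-text file with one "<symbol> <id>" pair per line, which is how tokens.txt and words.txt are usually laid out; the helper name find_symbol_id and the paths in the usage comments are hypothetical, not part of icefall.

```python
from pathlib import Path
from typing import Union


def find_symbol_id(table_path: Union[str, Path], symbol: str = "#0") -> int:
    """Return the integer ID of `symbol` in a "<symbol> <id>" text table.

    Hypothetical helper: it only assumes one whitespace-separated
    "<symbol> <id>" pair per line, as in tokens.txt / words.txt.
    """
    with open(table_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            # Skip blank or malformed lines; match the symbol exactly.
            if len(fields) == 2 and fields[0] == symbol:
                return int(fields[1])
    raise KeyError(f"{symbol!r} not found in {table_path}")


# Usage (paths are placeholders; point them at your own lang directory):
#   token-level G -> backoff_id = find_symbol_id("path/to/tokens.txt")
#   word-level G  -> backoff_id = find_symbol_id("path/to/words.txt")
```

Which file to read depends only on whether G was built over tokens or over words, so the two tables can legitimately give different IDs for #0.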
@csukuangfj Thank you for the super quick reply. I use an LG graph where L is built from subwords and G is an n-gram model. I followed 3 steps:
- Get the n-gram graph with generate-lm.sh.
- Get the L graph with https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang_bpe.py
- Compile L and G.
So in L, #0 has ID 1024, and in G, #0 has ID 5741. I am confused about which value is the right backoff-id for the n-gram LM (LG graph). Thanks
By the way, we only use the argument --backoff-id for G used in shallow fusion.
What is the use of backoff ID for an LG graph?