
error while generating arpa from trie

Open nmabhi opened this issue 7 years ago • 2 comments

Could you please tell me which arguments have to be passed to dump_trie_main.cc to convert a trie to .arpa? TIA

nmabhi avatar Jan 11 '18 11:01 nmabhi

image

Also, I got this error when I ran dump_trie_main.cc with "path to the trie file" and "path to the location where I want to create the arpa" as my arguments. There seems to be an error in the quantisation file.

nmabhi avatar Jan 11 '18 11:01 nmabhi

It looks like you're trying to compile dump_trie_main.cc on its own (the command line was cut off in the screenshot). I'd recommend using bjam for this, since it's the old version of the repo. Currently your command is missing -DKENLM_MAX_ORDER=5, but even once that .cc file compiles, it will also need to link against many more .cc files.

This commit is probably best for dumping the whole trie: https://github.com/kpu/kenlm/blob/76ed9775559c21175bf472f43d7a0445abc141d0/lm/dump_trie_main.cc

Check out the repo at that revision (it's in the bounded-noquant branch). Change the trie type on line 78 (https://github.com/kpu/kenlm/blob/76ed9775559c21175bf472f43d7a0445abc141d0/lm/dump_trie_main.cc#L78) to the type of model you have, then compile with bjam.

The arguments are the trie file and a prefix for the files it will generate. It will generate a separate file for each order (1, 2, 3, 4, 5, etc.), which can then be massaged into an ARPA file with some glue and cat.

kpu avatar Jan 18 '18 17:01 kpu
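
The "glue and cat" step mentioned above could look roughly like the sketch below. It is only an illustration and rests on assumptions not confirmed in this thread: that each per-order file already holds one n-gram per line in ARPA line format (log10 probability, tab, words, optional tab and backoff), and that you pass the files in order 1, 2, 3, and so on. The function name `build_arpa` and the file names in the usage note are hypothetical. If the dump uses a different layout, the lines would need converting first.

```python
from pathlib import Path


def build_arpa(order_files, out_path):
    """Concatenate per-order n-gram files into a single ARPA file.

    Assumption: order_files[i] holds the (i+1)-grams, one per line,
    already formatted as ARPA entries. Nothing here is KenLM's own code.
    """
    sections = []
    for path in order_files:
        text = Path(path).read_text(encoding="utf-8")
        # Skip blank lines so the counts written to the header are accurate.
        sections.append([line for line in text.splitlines() if line.strip()])

    with open(out_path, "w", encoding="utf-8") as out:
        # Header: \data\ followed by one "ngram N=count" line per order.
        out.write("\\data\\\n")
        for order, lines in enumerate(sections, start=1):
            out.write(f"ngram {order}={len(lines)}\n")
        # Body: a blank line, then a "\N-grams:" section for each order.
        for order, lines in enumerate(sections, start=1):
            out.write(f"\n\\{order}-grams:\n")
            for line in lines:
                out.write(line + "\n")
        # Footer that closes the ARPA file.
        out.write("\n\\end\\\n")
```

For example, `build_arpa(["dump.1", "dump.2", "dump.3"], "model.arpa")` would stitch three hypothetical per-order files into `model.arpa`. Counting the lines in Python rather than by hand keeps the `ngram N=` header consistent with the sections that follow.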