generalized-language-modeling-toolkit

Standard Formats

Open renepickhardt opened this issue 11 years ago • 2 comments

The output of language models and n-grams should follow standard formats, e.g.:

- weighted finite-state transducer (WFST) format
- ARPA format

I am currently not sure whether more formats exist. They should be researched and implemented.
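
For reference, a minimal sketch of what ARPA output could look like. It assumes the model is available as per-order lists of (tokens, log10 probability, optional backoff weight). The function name `to_arpa`, the input layout, and the numbers are hypothetical illustrations, not part of the toolkit.

```python
def to_arpa(ngrams_by_order):
    """Serialize n-grams to ARPA text.

    ngrams_by_order: dict mapping order (1, 2, ...) to a list of
    (tokens, log10_prob, backoff) tuples; backoff may be None.
    This input layout is an assumption made for the sketch.
    """
    orders = sorted(ngrams_by_order)

    # Header: one count line per n-gram order.
    lines = ["\\data\\"]
    for order in orders:
        lines.append(f"ngram {order}={len(ngrams_by_order[order])}")
    lines.append("")

    # One section per order: log10 prob, words, optional backoff weight.
    for order in orders:
        lines.append(f"\\{order}-grams:")
        for tokens, logprob, backoff in ngrams_by_order[order]:
            fields = [f"{logprob:.4f}", " ".join(tokens)]
            if backoff is not None:
                fields.append(f"{backoff:.4f}")
            lines.append("\t".join(fields))
        lines.append("")

    lines.append("\\end\\")
    return "\n".join(lines) + "\n"


# Made-up toy model, for illustration only.
toy_model = {
    1: [(("<s>",), -99.0, -0.3010), (("a",), -0.3010, None)],
    2: [(("<s>", "a"), -0.1761, None)],
}
arpa_text = to_arpa(toy_model)
```

Whether the toolkit's models can be expressed this way depends on whether they fit ARPA's single probability plus backoff weight per n-gram.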

renepickhardt Mar 08 '14 10:03

In our experiments we found that we won't be able to conform to standard formats (namely ARPA). I guess we won't have this for the stable release then?

lschmelzeisen Jan 06 '15 14:01

Yes, our current tool will not provide an ARPA file, since a good decoder is shipped. I would still leave this bug open (maybe move it to a different milestone), since saving our model as an FST might still be an option, also for saving space and improving runtime.

renepickhardt Jan 06 '15 14:01