meta language model estimation

language model estimation

Open smassung opened this issue 9 years ago • 2 comments

language_model needs the ability to estimate a model from a corpus instead of requiring a .arpa file.

Sep 11 '15 00:09 smassung

We should consider this formulation (scroll down for the actual paper): http://homepages.inf.ed.ac.uk/s0562315/progs/#pldlm

Kenneth even mentions it in his thesis as future work. It looks like it isn't much harder to implement than modified interpolated Kneser-Ney, while giving better perplexity.
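
For reference, here is a rough sketch of what estimating from a corpus could look like for the baseline being compared against. It is plain interpolated Kneser-Ney for bigrams with a single fixed discount, not the modified three-discount variant and not the power-law discounting formulation linked above. The function name, the `<s>`/`</s>` markers, and the 0.75 discount are assumptions for illustration only; none of this is existing language_model code.

```python
from collections import Counter, defaultdict


def estimate_kn_bigram(sentences, discount=0.75):
    """Estimate an interpolated Kneser-Ney bigram model (hypothetical helper).

    `sentences` is an iterable of already-tokenized sentences (lists of str).
    Returns a function prob(word, prev) giving P(word | prev).
    """
    # Count bigrams, padding each sentence with assumed boundary markers.
    bigram_counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        bigram_counts.update(zip(padded, padded[1:]))

    context_totals = Counter()       # c(prev, *)
    followers = defaultdict(set)     # distinct words seen after prev
    predecessors = defaultdict(set)  # distinct words seen before word
    for (prev, word), count in bigram_counts.items():
        context_totals[prev] += count
        followers[prev].add(word)
        predecessors[word].add(prev)

    num_bigram_types = len(bigram_counts)

    def continuation(word):
        # Fraction of distinct bigram types that end in `word`.
        # Unseen words get 0 here; there is no unknown-word handling.
        return len(predecessors.get(word, ())) / num_bigram_types

    def prob(word, prev):
        total = context_totals.get(prev, 0)
        if total == 0:
            # Unseen context: fall back to the continuation distribution.
            return continuation(word)
        count = bigram_counts.get((prev, word), 0)
        discounted = max(count - discount, 0.0) / total
        # Mass removed by discounting, redistributed via the lower order.
        backoff_weight = discount * len(followers[prev]) / total
        return discounted + backoff_weight * continuation(word)

    return prob
```

Calling `prob = estimate_kn_bigram(corpus)` on a list of tokenized sentences gives a distribution that sums to one over the observed vocabulary for any seen context. A real implementation would also need higher orders, unknown-word handling, and .arpa-compatible output.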

Sep 18 '15 04:09 skystrife

Once we implement this, we should have some way of saving the tokenization setup to ensure queries using the LM are tokenized the same way.

Nov 20 '15 17:11 smassung
