
Kneser-Ney implementation in Python

Results 6 kneser-ney issues

Just a question about the log-probability calculation in the probability interpolation. https://github.com/smilli/kneser-ney/blob/master/kneser_ney.py#L132 `order[kgram] += last_order[suffix] + backoff[prefix]` The original interpolation is defined on ordinary probabilities, not log probabilities. It...
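For context on the question above: interpolation adds a discounted term to a backoff weight times a lower-order probability, and a sum of probabilities is not the sum of their logs. The following is a minimal sketch, not the repository's code, of how that sum can be carried out when all three quantities are held as natural-log values. The function name `interpolate_log` and its parameters are hypothetical.

```python
import math

def interpolate_log(log_discounted, log_backoff, log_lower):
    """Return log(p_discounted + backoff * p_lower) from the three logs.

    Hypothetical helper: assumes every argument is a natural-log value.
    """
    a = log_discounted
    # A product in probability space is a sum in log space.
    b = log_backoff + log_lower
    hi, lo = max(a, b), min(a, b)
    if hi == float("-inf"):
        # Both terms are zero probability.
        return hi
    # log(e^hi + e^lo) = hi + log(1 + e^(lo - hi)), which avoids underflow.
    return hi + math.log1p(math.exp(lo - hi))

# 0.3 + 0.4 * 0.5 = 0.5 in ordinary probability space
result = math.exp(interpolate_log(math.log(0.3), math.log(0.4), math.log(0.5)))
```

Simply adding the three logs would instead compute the log of the product of the three probabilities.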

If this language model is trained on one corpus (e.g. gutenberg) and applied to another (e.g. brown), it is very likely to encounter out of vocabulary words or unseen ngrams....

Maybe it is trivial and I am wrong. From the paper I think the count of a k-gram "word" is its occurrence in the corpus data not in its **higher-order...
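The question above turns on two kinds of count. In the standard formulation of Kneser-Ney smoothing, the highest order uses raw occurrence counts, while lower orders use continuation counts, meaning the number of distinct contexts a word follows. A small sketch of the difference for unigrams, with a hypothetical helper name and a made-up token sequence:

```python
from collections import Counter

def raw_and_continuation_counts(tokens):
    """Return (raw unigram counts, continuation counts) for a token list.

    Hypothetical helper for illustration only.
    """
    raw = Counter(tokens)
    # Distinct bigram types: each (previous word, word) pair counted once.
    bigram_types = set(zip(tokens, tokens[1:]))
    # Continuation count = number of distinct words that precede each word.
    continuation = Counter(word for _, word in bigram_types)
    return raw, continuation

tokens = "san francisco is far from san francisco bay".split()
raw, continuation = raw_and_continuation_counts(tokens)
# "francisco" occurs twice but only ever follows "san",
# so its raw count and continuation count differ.
```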

You pad the incoming sequence (https://github.com/smilli/kneser-ney/blob/master/kneser_ney.py#L147), but then use the original, unpadded tuple for scoring.
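A sketch of the behavior this report appears to expect, where the n-grams that get scored are taken from the padded sequence rather than the original one. This is an assumption about the intent, not the repository's code: the function names, the pad symbols `<s>` and `</s>`, and the `logprob` callable are all hypothetical.

```python
def padded_ngrams(tokens, n, start="<s>", end="</s>"):
    """Pad a token sequence on both sides and return its n-grams."""
    padded = [start] * (n - 1) + list(tokens) + [end] * (n - 1)
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]

def score_sentence(tokens, n, logprob):
    """Sum log probabilities over n-grams of the padded sequence.

    `logprob` is a hypothetical callable mapping an n-gram tuple to a
    log probability; the key point is that it only ever sees padded n-grams.
    """
    return sum(logprob(gram) for gram in padded_ngrams(tokens, n))

grams = padded_ngrams(["a", "b"], 3)
total = score_sentence(["a", "b"], 3, lambda gram: -1.0)
```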

I changed pad_symbol to left_pad_symbol and right_pad_symbol and added start_pad_symbol in KneserNeyLM, but there is still another error. We may be calling the log function with a negative value, but why was it negative?...

Environment: Python 3.5.2, nltk 3.2.1. Steps to reproduce: `python3 example.py`. Error message: ``` Traceback (most recent call last): File "example.py", line 8, in lm = KneserNeyLM(3, gut_ngrams, end_pad_symbol='') File "/Users/username/kneser-ney/kneser_ney.py", line...