kenlm

ngrams have different scoring levels

Open raviolli opened this issue 5 years ago • 0 comments

Hi there, just curious; this isn't really a bug...

How come the scores of 2-grams and 3-grams are not normalized against each other?

The 2-grams will typically have higher scores than the 3-grams.

This becomes problematic when trying to compare scores between 2-gram and 3-gram outputs.
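A minimal sketch of the effect, assuming each score is a sum of per-token log10 probabilities. The numbers are made up for illustration and are not real kenlm output, and the helper names `total_score` and `per_token_score` are hypothetical:

```python
# Hypothetical per-token log10 probabilities, invented for illustration.
# These are NOT real kenlm outputs.
bigram_logprobs = [-1.2, -0.8]
trigram_logprobs = [-1.2, -0.8, -1.0]


def total_score(logprobs):
    """Sum of per-token log10 probabilities (assumed form of the score)."""
    return sum(logprobs)


def per_token_score(logprobs):
    """Total score divided by token count: one possible length normalization."""
    return sum(logprobs) / len(logprobs)


bigram_total = total_score(bigram_logprobs)
trigram_total = total_score(trigram_logprobs)
bigram_avg = per_token_score(bigram_logprobs)
trigram_avg = per_token_score(trigram_logprobs)

# Every extra token adds another non-positive term, so the longer
# sequence ends up with the lower total.
print("totals:   ", bigram_total, trigram_total)
print("per token:", bigram_avg, trigram_avg)
```

Under that assumption the totals differ simply because the 3-gram has one more term, while the per-token averages are on a comparable scale. Whether dividing by length is the right normalization for a given use case is a separate question.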

Any insight would be great. Perhaps with a detailed explanation I can fix the issue and submit a pull request.

raviolli, Feb 25 '20 01:02