Yun Wang (Maigo)

34 comments by Yun Wang (Maigo)

Search for "Pretrained Models" on the demo website (https://projects.csail.mit.edu/soundnet/).

I get different results from both you and KenLM, but I believe KenLM is making a mistake here. What I got with KenLM on your corpus: ``` -0.6726411 went -0.022929981...

There are two other causes for the discrepancy: 1. KenLM does not include `` when calculating the vocabulary size, while your program does. I think KenLM's approach makes more sense...

Another example on which KenLM miscalculates `s.n[1]` and `s.n[2]` can be found in #405. In that example, this affects the discounts, and the probabilities in the final LM.
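To illustrate why wrong counts-of-counts change the final probabilities, here is a minimal sketch of the standard modified Kneser-Ney discount formulas (Chen and Goodman). It assumes `s.n[k]` is the number of n-grams of a given order whose adjusted count is exactly `k`; the function name `mkn_discounts` and the numbers in the usage lines are made up for illustration and are not taken from KenLM or from #405.

```python
def mkn_discounts(n1, n2, n3, n4):
    """Modified Kneser-Ney discounts (D1, D2, D3+) from counts-of-counts.

    n1..n4 are assumed to be the numbers of n-grams whose adjusted
    count is exactly 1, 2, 3 and 4 (hypothetical stand-ins for s.n[1..4]).
    """
    if min(n1, n2, n3, n4) <= 0:
        # The closed-form estimates are undefined when any count-of-counts is zero.
        raise ValueError("all counts-of-counts must be positive")
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1
    d2 = 2 - 3 * y * n3 / n2
    d3_plus = 3 - 4 * y * n4 / n3
    return d1, d2, d3_plus


# Illustrative numbers only: changing n1 by one shifts every discount.
print(mkn_discounts(4, 2, 1, 1))
print(mkn_discounts(5, 2, 1, 1))
```

Because `y` depends on both `n1` and `n2`, an error in either count propagates to all three discounts, and from there to every discounted probability and backoff weight of that order.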