pocolm
maxent LMs
Another issue for anyone who's watching this project: it would be nice, as an additional baseline for the paper, to try maxent LMs. Can someone figure out how to do this on, say, Switchboard or tedlium?
... I think the latest version of SRILM supports them, and they're supposed to be a little better than regular Kneser-Ney LMs.
FYI, on a news 1.5GB corpus, I get: Order 3 Order 4 srilm size ppl size ppl Unpruned 767,2 92,18 2071,7 66,86 Maxent 702,1 97,09 1952,8 70,33
not that good then
I don't really understand what you are saying here. Can you please format it more clearly and use the English standard for decimals, i.e. dot, not comma?
I found the reason for the crash with 4-gram pruning that you reported before: states with no counts were being discarded when we need to keep the discount amount. The fix is not a one-liner; I'll work on it today. It would affect even the unpruned perplexities.
Dan
yeah sorry copy paste from Excel.
Order 3
srilm standard: size=767.2 MB, ppl=92.18
srilm maxent: size=702.1 MB, ppl=97.09
Order 4
srilm standard: size=2071.7 MB, ppl=66.86
srilm maxent: size=1952.8 MB, ppl=70.33
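To make the comparison easier to read, here is a small sketch that turns the numbers above into relative differences of maxent against the standard SRILM model. The figures are copied from this comment; the names `RESULTS`, `relative_change` and `summarize` are made up for illustration and are not part of pocolm or SRILM.

```python
# Sizes (MB) and dev-set perplexities as reported in this thread.
# Structure: order -> model -> (size_mb, ppl)
RESULTS = {
    3: {"standard": (767.2, 92.18), "maxent": (702.1, 97.09)},
    4: {"standard": (2071.7, 66.86), "maxent": (1952.8, 70.33)},
}


def relative_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


def summarize(results):
    """Return {order: {"size": pct, "ppl": pct}} for maxent vs. standard."""
    summary = {}
    for order, models in results.items():
        base_size, base_ppl = models["standard"]
        me_size, me_ppl = models["maxent"]
        summary[order] = {
            "size": relative_change(base_size, me_size),
            "ppl": relative_change(base_ppl, me_ppl),
        }
    return summary


if __name__ == "__main__":
    for order, change in sorted(summarize(RESULTS).items()):
        print(
            f"order {order}: size {change['size']:+.1f}%, "
            f"ppl {change['ppl']:+.1f}%"
        )
```

On these numbers, maxent comes out with about 5% higher perplexity at both orders, with models roughly 6% to 9% smaller.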
The corpus is "French news shuffle 2014", a text file of about 1.5 GB; I held out 10k sentences as a dev set. Just for info, the order 4 maxent run took 2.5 hours and used up to 70 GB of RAM.
What I am trying to say here is that these results are somewhat surprising, because when I ran it on the cantab-tedlium text corpus (entropy filtered), maxent gave better results. But then I read Tanel's paper on maxent, and the improvements there were not so obvious.