Kenneth Heafield

Results 290 comments of Kenneth Heafield

This is cutting it close, especially if you have other things loaded. I would recommend converting the original ARPA to binary format again with better compression options (see https://neural.mt/code/kenlm/structures/). You might...

Run the `filter` program. It will print a help message with more command line documentation. `bin/filter vocab model:in.arpa out.arpa`

Use only one of `vocab:` or `model:` (the other is on stdin). Also, it's not a CSV, it's whitespace-delimited tokens.
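If the word list currently lives in a CSV, it has to be flattened to whitespace-delimited tokens before it is handed to `filter`. A minimal sketch of that conversion, assuming the words sit in the first CSV column; the helper name `csv_to_vocab_tokens` and the sample data are made up for illustration and are not part of KenLM:

```python
import csv
import io


def csv_to_vocab_tokens(csv_text, column=0):
    """Return the words from one CSV column as a whitespace-delimited string.

    Hypothetical helper: the column layout is an assumption about the input.
    """
    tokens = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= column:
            continue  # skip blank or short rows
        # A cell may itself contain spaces; split so every token stands alone.
        tokens.extend(row[column].split())
    return " ".join(tokens)


# Made-up sample: a word column followed by a count column.
vocab_text = csv_to_vocab_tokens("the,120\ncat,7\nsat,3\n")
print(vocab_text)
```

The resulting text can be written to a file and passed as the vocabulary, or piped to stdin when the model is the input named on the command line.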

Not sure why our cythons generate different output. Does 0760f4c4df76f3286656e7232dc3ad6495248bc2 work for you?

There is no fast path for scoring the entire vocabulary in a given context. A forward trie is more optimal for that. KenLM implements a reverse trie to optimize individual...
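Without a fast path, scoring the whole vocabulary in one context comes down to one query per candidate word. A rough sketch of that brute-force loop, assuming an object with the `BeginSentenceWrite` and `BaseScore` methods that the `kenlm` Python module exposes on `kenlm.Model`; the function name `score_vocabulary` and the file name in the usage comment are hypothetical:

```python
def score_vocabulary(model, new_state, context, vocab):
    """Score every word in `vocab` as the continuation of `context`.

    `model` is assumed to behave like kenlm.Model and `new_state` like the
    kenlm.State constructor. One query is issued per candidate word.
    """
    # Build the state for the context once, starting from <s>.
    state = new_state()
    model.BeginSentenceWrite(state)
    for word in context:
        next_state = new_state()
        model.BaseScore(state, word, next_state)
        state = next_state

    # Reuse one scratch output state; only the returned score is kept.
    scratch = new_state()
    return {word: model.BaseScore(state, word, scratch) for word in vocab}


# Possible usage with the kenlm module (assumed installed, model path made up):
#   import kenlm
#   model = kenlm.Model("model.binary")
#   scores = score_vocabulary(model, kenlm.State, ["the", "cat"], ["sat", "ran"])
```

The cost grows linearly with the vocabulary size, which is the limitation the comment describes.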

Did you install the dependencies documented on https://kheafield.com/code/kenlm/dependencies/ ? I smell a missing `libboost-all-dev`.

@cheahheng The problem is definitely that you don't have the full repo, just the inference stuff. Get it from this repo.

It looks like you're trying to compile `dump_trie_main.cc` on its own (the command line was cut off from the screenshot). I'd recommend using bjam for this (since it's the old...

I smell compilation with a different version of Boost than is installed as a shared library on the system.

Would a callback from LoadVirtual be sufficient?