Kenneth Heafield
Unable to reproduce.
```
heafield@meili:~/kenlm/build$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
heafield@meili:~/kenlm/build$ uname -a
Linux meili 5.4.0-74-generic #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021 x86_64 x86_64 x86_64...
```
Does 5cea457 fix this? @Domhnall-Liopa Thanks for the debugging tip! That will cause a memory leak, though, so I'm hesitant to use it.
Smells like you have the headers but not the libraries?
Would you mind repeating the last 8 words of the error message?
The file /content/mtnt/models/corpus.en.1000.bpe.lm.5.bin does not exist. As to why it doesn't exist, that's really a question for the author of `scripts/prepare_model.sh`.
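A quick way to confirm this kind of failure before handing a path to the loader is to check the file yourself. The following is only a minimal sketch using the Python standard library; `check_model_path` is a hypothetical helper name, not part of kenlm or of `scripts/prepare_model.sh`.

```python
import os


def check_model_path(path):
    """Return (ok, message) saying whether `path` is a readable regular file.

    Hypothetical helper for illustration; it does not call kenlm at all.
    """
    if not os.path.exists(path):
        return False, f"{path} does not exist"
    if not os.path.isfile(path):
        return False, f"{path} is not a regular file"
    if not os.access(path, os.R_OK):
        return False, f"{path} is not readable"
    return True, f"{path} looks loadable"


if __name__ == "__main__":
    # Path taken from the error message above.
    ok, message = check_model_path("/content/mtnt/models/corpus.en.1000.bpe.lm.5.bin")
    print(message)
```

If the check reports that the file does not exist, the problem is upstream of the language model code, in whatever step was supposed to produce the binary file.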
No, unigrams create corner cases (the highest order is the same as the lowest order) so I haven't implemented them.
This is not good. Can I get a gdb stack trace? Compile with debugging:
```bash
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j 4
cd /whereever/you/were/running #I think that's where you are?
gdb...
```
I'd be happy to have this feature added from contributors like you. There is the beginning of such support through this program: https://github.com/kpu/kenlm/blob/bounded-noquant/lm/builder/just_count_main.cc which can be run as independent processes...
The underlying C++ code is thread-safe (and, for that matter, the mmap can share memory). I don't know enough about Python, though.
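To illustrate the general idea of a read-only memory map being read from several threads, here is a small sketch using only Python's standard `mmap` module. It does not use kenlm and says nothing about whether the kenlm Python bindings are safe to share across threads; `checksum_slices` and `demo` are hypothetical names, and the input file is assumed to be non-empty.

```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def checksum_slices(path, n_workers=4):
    """Map `path` read-only once and sum its bytes from several threads.

    Assumes the file is non-empty (mapping an empty file raises ValueError).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            size = len(mapped)
            step = max(1, -(-size // n_workers))  # ceiling division
            ranges = [(start, min(start + step, size)) for start in range(0, size, step)]
            # Each worker only reads from the mapping; nothing writes to it.
            with ThreadPoolExecutor(max_workers=n_workers) as pool:
                parts = pool.map(lambda r: sum(mapped[r[0]:r[1]]), ranges)
                return sum(parts)


def demo():
    """Write a small temporary file, checksum it via the shared mapping, clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 4)
        path = tmp.name
    try:
        return checksum_slices(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because the mapping is opened with `ACCESS_READ`, no thread can modify it, which is what makes concurrent reads straightforward in this sketch.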
https://cmake.org/cmake/help/latest/module/ExternalProject.html