
KenLM: Faster and Smaller Language Model Queries

Results: 136 kenlm issues

Adds the files needed to build a conda package, along with build instructions. A conda package has already been released: https://anaconda.org/bitextor/kenlm. The release has...

I am trying to reproduce the findings from a machine translation paper (https://github.com/pmichel31415/mtnt) using Google Colab. After installing packages including KenLM, when I type: bash scripts/prepare_model.sh config/data.en.config I keep getting...

When installing KenLM on the following environment: ``` Linux pop-os 5.15.15-76051515-generic #202201160435~1642693824~21.10~97db1bb SMP Thu Jan 20 17:35:05 U x86_64 x86_64 x86_64 GNU/Linux ``` and Python version: ``` Python 3.9.7 ```...

Howdy yall. I am trying to analyze the data in the language models found here: https://bio.nlplab.org/#ngram-model I am loading the 1-gram + 2-gram data into the arpa format, everything looks...

Hello, are there any scripts or parameters that I can add to get the total perplexity including/excluding OOVs? I want to find a way to calculate the perplexity including OOVs.
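The difference between the two figures asked about above can be sketched in a few lines. This is only a minimal illustration, not KenLM's own code: it assumes you already have, for each token, a base-10 log probability and a flag saying whether the token was out of vocabulary. The function name `perplexity`, the `include_oov` parameter, and the example scores are all hypothetical.

```python
from typing import Iterable, Tuple


def perplexity(scores: Iterable[Tuple[float, bool]], include_oov: bool = True) -> float:
    """Compute perplexity from per-token (log10 probability, is_oov) pairs.

    Hypothetical helper: the input layout is an assumption for this sketch.
    """
    total_log10 = 0.0
    token_count = 0
    for log10_prob, is_oov in scores:
        # When excluding OOVs, drop the token from both the sum and the count.
        if is_oov and not include_oov:
            continue
        total_log10 += log10_prob
        token_count += 1
    if token_count == 0:
        raise ValueError("no tokens left to score")
    # Perplexity is 10 raised to the negative average log10 probability.
    return 10.0 ** (-total_log10 / token_count)


# Made-up scores for three tokens; the last one is flagged as OOV.
example = [(-1.0, False), (-2.0, False), (-3.0, True)]
ppl_incl = perplexity(example, include_oov=True)
ppl_excl = perplexity(example, include_oov=False)
print(ppl_incl, ppl_excl)
```

Excluding OOVs removes those tokens from both the numerator and the token count, so the two numbers can differ substantially on text with many unknown words.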

Hello, I'd like to compute perplexity on different text corpora given an n-gram model built with kenlm. I found in some old issues that the `--vocab_pad` param should be used with a...

This allows for compatibility with Python 3.10.

Hi @kpu, I ran into a weird issue: training the n-gram model with a relatively small corpus was OK, but it raised a baddiscount error with a larger corpus. 1. Training the...

Hi @kpu, I am trying to generate an ARPA file from ~20 GB of text. It's taking too long to generate. The initial 4 steps are relatively fast compared to Step:5...

Hi, I'm trying to pip3 install kenlm on Ubuntu Focal and got the following error. When I did the same pip3 install on Ubuntu Bionic it was okay. Is it...