langid.py
Stand-alone language identification system
Hello, it seems that `langid` uses multiprocessing under the hood to make classification faster. Is there any way in Python to force `langid` to use a single process (turn off...
The text `"Bestias desagradables. No sé por qué acepté apostar"` should be classified as Spanish, but it is instead classified as Portuguese with high confidence. If you remove the accents,...
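The accent-removal step mentioned in this report can be reproduced with the standard library alone. The following is a minimal sketch, not part of `langid`; the helper name `strip_accents` is an assumption introduced here for illustration. It decomposes each character (NFD) and drops the combining marks, so both variants of the sentence can then be passed to `langid.classify` for comparison.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks (accents) removed.

    Hypothetical helper for reproducing the report; not a langid function.
    """
    # NFD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


original = "Bestias desagradables. No sé por qué acepté apostar"
unaccented = strip_accents(original)
print(unaccented)
```

Calling `langid.classify(original)` and `langid.classify(unaccented)` side by side would show whether the accents alone change the predicted language.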
When I run detection on "Hello China" with `print(langid.classify("Hello China"))`, the result is `('it', -37.309250354766846)`. @Paczesiowa @pquentin @martinth @jnothman @saffsd
I have the following text that is a mix of `english` and `sesotho`: ``` >>>Ska rebona re phela\nKgale re sokola rona re phelela mmino\nO skang potja ka dilo\nKgale re sokola...
Hi, I just stumbled over langid and then, when trying how suitable it'd be for my purposes, stumbled over this: ``` ❯ echo 'در' | langid -l ar,fa,ota Traceback (most...
When running batch training with the `-d` flag, the following error is raised: line 585, in main `writer.writerow(['path']+nb_classes)` `NameError: name 'nb_classes' is not defined`. Looks like there is a misplaced variable assignment....
From the README files, I found out how to train a brand-new model. Can we add a new corpus to the default model rather than training a brand-new model?
It runs without problems on my local Windows machine (Python 3.6), but on an Ubuntu server (Python 3.7) it reports the following error: `Traceback (most recent call last): File "lan_det.py", line 9, in print(lan_det(text)) File "lan_det.py", line 6, in lan_det return langid.classify(text) File "/home/env/rfh_01/lib/python3.7/site-packages/langid-1.1.6-py3.7.egg/langid/langid.py", line 105, in classify load_model() File "/home/env/rfh_01/lib/python3.7/site-packages/langid-1.1.6-py3.7.egg/langid/langid.py", line...
If word 3-grams are set in tokenize.py, is the unit of max_order in DFfeatureselect.py words or bytes? I ask because in some languages a single character takes up several bytes.
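To make the byte-versus-word distinction in this question concrete, here is a small stand-alone sketch. It does not reproduce what tokenize.py or DFfeatureselect.py actually do; the functions `byte_ngrams` and `word_ngrams` and the sample text are assumptions introduced only for illustration. It shows how the same text yields very different n-gram counts depending on whether the unit is a UTF-8 byte or a whitespace-separated word.

```python
from typing import List, Tuple


def byte_ngrams(text: str, n: int) -> List[bytes]:
    """All contiguous n-byte windows over the UTF-8 encoding of text."""
    data = text.encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def word_ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-word windows over whitespace-separated tokens."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical sample: four two-character Chinese words separated by spaces.
sample = "你好 世界 再见 朋友"

print(len(sample))                   # number of Unicode characters
print(len(sample.encode("utf-8")))   # number of bytes (3 per CJK character here)
print(len(byte_ngrams(sample, 3)))   # count of byte 3-grams
print(word_ngrams(sample, 3))        # word 3-grams
```

With byte units, a 3-gram here covers at most one CJK character, whereas a word 3-gram spans three whole words, which is why the unit of `max_order` matters for multi-byte scripts.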
[image] Any plans to add support for it?