lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Results 30 lingua-py issues

Hi! Could you please make available the ONNX file for the language detector model currently used in the project? Thanks.

It would be interesting to have "prompt + LLM" accuracy in the benchmark as well. A simple prompt to GPT-4, with the output restricted using LMQL, should be quite straightforward.

Hi! Is there a specific method to get confidence scores when detecting multiple languages in mixed-language texts? Should I calculate it myself based on the DetectionResult occurrences?
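One way to approach the "calculate it myself" idea in the question above is to weight each detected segment by its word count and report each language's share of the text. The following is only a minimal sketch under stated assumptions: `Segment` is a hypothetical stand-in for a detection result that carries a language and a word count, `language_shares` is an illustrative helper name, and the resulting share is a coverage ratio, not a confidence score produced by the library.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one detected segment (language name + word count)."""
    language: str
    word_count: int


def language_shares(segments):
    """Return each language's share of the total word count, largest first."""
    totals = defaultdict(int)
    for segment in segments:
        totals[segment.language] += segment.word_count
    total_words = sum(totals.values())
    if total_words == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {language: count / total_words for language, count in ranked}


# Made-up segments for illustration only, not real detector output.
example = [Segment("GERMAN", 5), Segment("JAPANESE", 1), Segment("GERMAN", 2)]
shares = language_shares(example)
```

With real detector output, the segments could be built from whatever language and word-count fields the results expose; that mapping is an assumption here and should be checked against the installed version.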

When I pass in German sentences that contain quoted Japanese words, lingua sometimes claims the text is 100% Japanese. For example: `Wir stoßen an: "かんぱい". Er lächelte.` (in English,...

enhancement

Based on the reported graphs, I was expecting high single-word detection accuracy. However, when I tested some simple greetings, the results were quite poor. I'm thinking that I might have...

Since the Language and IsoCode classes are Rust classes, it would be nice to bring back the `from_str` methods so I can dynamically import ISO codes.

Hi there, I am using your repo for my master's thesis. How can I cite it? Thanks! :)

### Discussed in https://github.com/pemistahl/lingua-py/discussions/128 Originally posted by **thirtha-prasad** March 3, 2023 I'm getting the error below. The package is installed using pip and is available in the Python path. ImportError: cannot import name 'Language'...

Given a text in Ukrainian, two methods provide two completely different results.

```python
from lingua import LanguageDetectorBuilder

# Build a detector over all supported languages.
detector = LanguageDetectorBuilder.from_all_languages().build()
string = "Що найбільше подобається читачам у жанрі \"Фентезі\"?"
print(detector.compute_language_confidence_values(string))
# Reported output:
# [ConfidenceValue(language=Language.KAZAKH, value=1), ConfidenceValue(language=Language.AFRIKAANS,...
```

I just wanted to share these considering the comparisons mentioned in the README. * I see that fasttext seems to be better overall for my use case, though fasttext may just...