polyglot

Multilingual text (NLP) processing toolkit

Results: 113 polyglot issues

I wrote this simple script that takes a source file and transliterates it into the target file specified in the arguments:

```python
#!/usr/bin/env python
import io
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
...
```
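The issue text is cut off, so the rest of the script is unknown. As a rough illustration of the file-to-file workflow it describes, here is a minimal Python 3 sketch that avoids the `reload(sys)` / `setdefaultencoding` workaround by opening both files with an explicit encoding. The function name `transliterate_file` and the `transliterate_word` callable are hypothetical and not taken from the issue; any per-word transliteration function could be passed in.

```python
import io


def transliterate_file(source_path, target_path, transliterate_word):
    """Transliterate a UTF-8 text file word by word.

    `transliterate_word` is a hypothetical callable (str -> str) supplied
    by the caller; this sketch does not assume any particular library.
    """
    with io.open(source_path, "r", encoding="utf-8") as src, \
            io.open(target_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Split on whitespace, convert each word, and rejoin with
            # single spaces (original spacing is not preserved).
            words = line.split()
            dst.write(" ".join(transliterate_word(w) for w in words) + "\n")
```

Passing the conversion function as an argument keeps the file handling separate from whichever transliteration backend is used.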

Hi there, I'm using polyglot to do some tokenization and NER extraction, and am using both outputs as features in a machine learning model. Since I know in advance...

    from polyglot.text import Text

    file = open('input_raw.txt', 'r')
    input_file = file.read()
    file = Text(input_file, hint_language_code='fa')
    list_entity = []
    for sentence in file.sentences:
        for entity in sentence.entities:
            #print(entity)
            list_entity.append(entity)
    # print(list_entity)...

After doing polyglot download of LANG:sw and running text.pos_tags on a sentence detected as Swahili, I get this error: `ValueError: Package 'pos2.sw' not found in index`. Would it be possible...

In the following example, the text is detected as Traditional Chinese, but it is actually Simplified Chinese:

```
银行本票的提示付款期限
什么是“快捷付”功能
```

How can I solve this problem? Additionally, for the text in...

It would be great to have a function that returns the lemmatized form of a word.

I was using Polyglot to tokenize documents with hashtags in Swedish. I noticed that the tokenizer splits off the first letters of hashtags that follow an emoji as separate words. For...
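One possible workaround for this kind of problem, sketched here under assumptions and not taken from the issue, is to pull hashtags out with a regular expression before the remaining text reaches a tokenizer. In the sketch below, `split_keeping_hashtags` is a hypothetical helper, and plain whitespace splitting stands in for the real tokenizer; it does not reproduce or fix polyglot's own behavior.

```python
import re

# In Python 3, \w is Unicode-aware for str patterns, so letters such as
# "ä" and "ö" are matched as part of the hashtag.
HASHTAG_RE = re.compile(r"#\w+")


def split_keeping_hashtags(text):
    """Return tokens from `text`, keeping each hashtag as a single token.

    Text between hashtags is split on whitespace as a stand-in for a
    real tokenizer (an assumption made for this sketch).
    """
    tokens = []
    pos = 0
    for match in HASHTAG_RE.finditer(text):
        # Tokenize whatever precedes the hashtag, then keep the hashtag whole.
        tokens.extend(text[pos:match.start()].split())
        tokens.append(match.group())
        pos = match.end()
    tokens.extend(text[pos:].split())
    return tokens
```

Because the hashtag is matched from the `#` character onward, an emoji directly in front of it ends up in the preceding chunk and does not affect the hashtag token.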

When you set the path using the ```polyglot download``` options and provide a path containing uppercase letters, it prints the whole path back in lowercase, saying...

When running `pip3 install polyglot==16.7.4` no requirements get installed. See (this uses a fresh [python 3.5 docker image](https://github.com/docker-library/python/blob/855b85c8309e925814dfa97d61310080dcd08db6/3.5/Dockerfile) with nothing else installed): ``` $ docker run --rm -it python:3.5 pip3...