stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
Before you start, make sure to check out: * Our documentation: https://stanfordnlp.github.io/stanza/ * Our FAQ: https://stanfordnlp.github.io/stanza/faq.html * Github issues (especially closed ones) Your question might have an answer in these...
Hi, I've been really liking how Stanza just "works" out of the box for the last month or so. However, I have recently hit a wall and the documentation is...
**Describe the bug** The lemma of "rose" (the flower) is "rise" in 1.4.0. **To Reproduce** Steps to reproduce the behavior: Take the sentence "I gave her a rose" as an example, the...
I'd like to add an NER model for Polish. For now, I wonder what else is needed. **Datasets** - Char-LM: [Wikipedia Subcorpus](http://clip.ipipan.waw.pl/PolishWikipediaCorpus) - NER annotations: [NKJP Corpus](http://clip.ipipan.waw.pl/NationalCorpusOfPolish) **Baseline models** - [char-lm...
Hi, I got the following output from the NER process. I want it in the form of a dataframe, with "id, text, upos, xpos, ner" as the column names. Is it possible to convert it into...
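One possible way to approach the dataframe question above is sketched here. It is only an illustration under stated assumptions: the sample data is made up and merely shaped like the per-sentence lists of token dicts that Stanza's `Document.to_dict()` is assumed to produce, and `SAMPLE_DOC`, `COLUMNS`, and `tokens_to_rows` are hypothetical names, not part of Stanza. pandas is used only if it happens to be installed.

```python
# Hypothetical sample: one inner list per sentence, one dict per token.
# The shape is an assumption about what Document.to_dict() returns;
# the values are invented for illustration.
SAMPLE_DOC = [
    [
        {"id": 1, "text": "Barack", "upos": "PROPN", "xpos": "NNP", "ner": "B-PERSON"},
        {"id": 2, "text": "Obama", "upos": "PROPN", "xpos": "NNP", "ner": "E-PERSON"},
        {"id": 3, "text": "spoke", "upos": "VERB", "xpos": "VBD", "ner": "O"},
    ],
    [
        {"id": 1, "text": "Thanks", "upos": "NOUN", "xpos": "NN", "ner": "O"},
    ],
]

# Column names requested in the question.
COLUMNS = ["id", "text", "upos", "xpos", "ner"]


def tokens_to_rows(sentences, columns=COLUMNS):
    """Flatten sentences into one row dict per token, keeping only `columns`."""
    rows = []
    for sentence in sentences:
        for token in sentence:
            # Missing keys become None instead of raising KeyError.
            rows.append({col: token.get(col) for col in columns})
    return rows


rows = tokens_to_rows(SAMPLE_DOC)

try:
    import pandas as pd  # optional dependency

    frame = pd.DataFrame(rows, columns=COLUMNS)
    print(frame)
except ImportError:
    # Without pandas, the flattened rows can still be inspected directly.
    for row in rows:
        print(row)
```

With a real pipeline result, the sample list would presumably be replaced by the output of the document's own conversion to dicts; whether every token dict carries an `ner` key depends on which processors were run.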
Hi, I trained a **langid model** with my dataset following these [steps](https://stanfordnlp.github.io/stanza/langid.html#training-your-own-model) and ending with this method: ```python python -m stanza.models.lang_identifier --data-dir data --eval-length 10 --randomize --save-name model.pt --num-epochs 100...
Hi all! First time posting a question, feel free to correct me if I'm not following conventions. I'm a Python newbie trying to start an NLP project, so all help...
Hello, I'm trying to add a new language to Stanza, and for that I'm following **https://stanfordnlp.github.io/stanza/new_language.html**. I already have language data in CoNLL-U format and am using **python3...
I started a server using the following command line on an Ubuntu Hyper-V server running on Windows Server 2016: java -Xmx16g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -serverProperties StanfordCoreNLP-chinese.properties -port 9009 -timeout 150000 When I...
Hi, I was checking your lemmatization for Hindi and Urdu and found that possessive [genitive] case markers in Hindi and Urdu are wrongly lemmatized. It refers to the Hindi possessive...