stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Results: 180 stanza issues

**Describe the bug** When POS tagging a specific string in Spanish a RuntimeError is **reproducibly** thrown without any apparent reason. **To Reproduce** Steps to reproduce the behavior: 1. Run the...

bug

...also support for zero-node annotation! So for a sentence with underscores in it, the system would actually be able to recognize it as a possible-coreferent (i.e. zero-anaphora) and mark it...

When tokenising Vietnamese text, StanzaNLP very regularly produces tokens that are only a single consonant and punctuation, e.g. "c,". This is obviously not a word. It only happens at punctuation...

bug

StanzaNLP has been invaluable in Chinese tokenisation at scale! Still, there are some issues that regularly come up, and I'm wondering whether they are intentional. I'd also like to record...

question

Hi Sir, the offline Stanza is working fine. I want the CoNLL-U output; can you suggest how to get it? Thanks in advance.

question
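For the CoNLL-U question above, the target format is ten tab-separated columns per word, with a blank line after each sentence. The following is a minimal sketch of that layout and is not Stanza's own exporter. It assumes each sentence is a list of word dicts keyed by `id`, `text`, `lemma`, `upos`, `xpos`, `feats`, `head`, and `deprel`, which is roughly the shape of the output of Stanza's `Document.to_dict()`. The helper names `format_id`, `word_to_conllu`, and `sentences_to_conllu` are hypothetical.

```python
# The ten CoNLL-U columns in order. "text" holds the FORM column here,
# on the assumption that the input dicts use Stanza-style key names.
CONLLU_FIELDS = ("id", "text", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def format_id(value):
    """Render an integer ID as '3' and a multi-word token range as '3-4'."""
    if isinstance(value, (tuple, list)):
        return "-".join(str(v) for v in value)
    return str(value)


def word_to_conllu(word):
    """Turn one word dict into a single tab-separated CoNLL-U line."""
    columns = []
    for field in CONLLU_FIELDS:
        value = word.get(field)
        if value is None or value == "":
            columns.append("_")  # CoNLL-U marks unspecified fields with "_"
        elif field == "id":
            columns.append(format_id(value))
        else:
            columns.append(str(value))  # keeps head 0 (the root) as "0"
    return "\t".join(columns)


def sentences_to_conllu(sentences):
    """Join sentences into CoNLL-U text; each sentence ends with a blank line."""
    blocks = ["\n".join(word_to_conllu(w) for w in sentence)
              for sentence in sentences]
    return "\n\n".join(blocks) + "\n\n"


if __name__ == "__main__":
    # Hand-written example input, not real Stanza output.
    example = [[
        {"id": 1, "text": "Hello", "lemma": "hello", "upos": "INTJ",
         "head": 0, "deprel": "root"},
    ]]
    print(sentences_to_conllu(example), end="")
```

With a real pipeline, the input would come from something like `doc.to_dict()`. Recent Stanza releases also ship a CoNLL helper in `stanza.utils.conll`, which is likely the better choice when available; check the documentation of the installed version for the exact calls.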

I am running into a situation where I need to sentence-split a corpus with gold tokenization. I basically have input like this: This is a sentence . And this is...

question

I've been testing the CorefUD-trained Stanza model on English and seeing some inconsistent results, especially with regard to singletons. Since the model is trained on data that has singletons (but...

enhancement

Since spaCy now supports Python 3.13, I installed stanza and ran all the tests in my Python 3.13 virtual environment. All of them passed.

Is there a way to extract the Hebrew root from a Hebrew word? ## Hebrew root **כּתב**: - **כּ**וֹ**תב**: - **כּ**ַ**ת**ָ**ב**ָה - **מ**ִ**כ**תָ**ב**

question

For example, `Tes amis` in French could be treated as `tes` and therefore be lemmatized as `ton`

enhancement