stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Results: 180 stanza issues

**Describe the bug** `stanza.download()` fails to download resources from a host that sends a [chunked response](https://en.wikipedia.org/wiki/Chunked_transfer_encoding). ```python In [1]: import stanza In [2]: stanza.download('en') --------------------------------------------------------------------------- TypeError Traceback (most recent call...

bug
fixed on dev

**Is your feature request related to a problem? Please describe.** I wrote a coreference resolver based on my requirements using the coref model as base to create clusters. Sometimes in...

enhancement

**Describe the bug** In `yo como carne`, `como` is identified as `upos SCONJ`, while it should be `VERB`. I am running this pipeline: ``` { "text": "Yo como carne.", "processors":...

bug

I encountered an issue while training and evaluating models using the specified setup. When the training process completed, a "Permission Denied" error occurred with the temporary file used to save...

bug

**Describe the bug** Evaluating "Ich wasche meine Hände." in Stanza 1.11 leads to "Hände" being treated as a verb with `lemma=hinden`. There is no verb "hinden" in German, and Hände...

bug

Hello, I have multiple Tregex patterns and want to use the `CoreNLPClient.tregex()` method to get matching results for sentences. Do I need to call `tregex()` multiple times for multiple patterns, or can...

question
stale

Would a morpheme segmentation processor that turns arbitrary text into morphemes be a viable feature? My friend and I have been working on a library based on a model in...

enhancement

I would like to use the coref processor for dialogues where I know the speaker of each sentence. This should help eliminate spurious I/you coref chains that are obviously wrong...

question

In many places it does things such as
```
deprel_seqs = [self.vocab['deprel'].unmap([preds[1][i][j+1][h] for j, h in enumerate(hs)]) for i, hs in enumerate(head_seqs)]
```
which, while unlikely, includes PAD and the...

bug
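
To illustrate the kind of guard the PAD report above points toward, here is a minimal sketch of choosing the best-scoring label while skipping reserved vocabulary entries. It is not taken from the issue or from Stanza's code: `RESERVED_IDS` and `argmax_excluding` are hypothetical names, and treating index 0 as PAD is an assumption made only for this example.

```python
import math

# Hypothetical reserved label ids; index 0 standing for PAD is an assumption.
RESERVED_IDS = (0,)


def argmax_excluding(scores, reserved=RESERVED_IDS):
    """Return the index of the highest score, ignoring reserved ids."""
    best_idx, best_score = None, -math.inf
    for idx, score in enumerate(scores):
        if idx in reserved:
            continue  # never predict a reserved entry such as PAD
        if score > best_score:
            best_idx, best_score = idx, score
    if best_idx is None:
        raise ValueError("no non-reserved labels to choose from")
    return best_idx


# The reserved entry has the highest raw score, but it is skipped.
scores = [5.0, 1.0, 3.0, 2.0]
chosen = argmax_excluding(scores)
```

In a tensor-based model the same idea would more likely be expressed by masking the reserved columns of the score matrix before taking the argmax; the loop form is used here only to keep the sketch dependency-free.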

I get out of memory errors on unpunctuated text input. And I believe the reason might be the batch dividing method on the TokenizeProcessor. The docs claim that the batches...

question