stanza issues

Dependency parsing: is there some way to discourage multiple `nsubj` dependents?

9

A very weird English tree produced by Stanza 1.6.0 in the [demo](http://stanza.run/): > My cousin my extremely rude colleague admired last year chewed the chicken enthusiastically. In UD, no word...

nschneid

enhancement

question

Using multiple models in NER

6

I want to run the following code, but an error occurred. `import stanza` `pipe = stanza.Pipeline("en", processors="tokenize,ner", package={"ner": ["ncbi_disease", "ontonotes"]})` `doc = pipe("John Bauer works at Stanford and has hip...`

linlinloo

question

Proiel parser exhibits odd behaviour with respect to punctuation

45

**Describe the bug** If there is a comma in the parsed sentence, the PROIEL model: a) does not tokenize the comma, it just bundles it with the preceding word. The...

pseudomonas

bug

Batch Sizes not used anywhere? Out of mem...

2

**Describe the bug** I am seeing out-of-memory errors with processes of 35 GB; stanza could be tracked down as the reason. **To Reproduce** Steps to reproduce the behavior: 1. Take e.g....

andrePankraz

bug

coref not using proper noun

1

Hi, Thanks for this tool. I noticed that sometimes coref doesn't use the proper noun, is there any way to make it use the proper noun? Here is my code...

fakerybakery

bug

MPS Apple Silicon GPU not detected when using "use_gpu=True"

6

Hi, The Apple Silicon GPU (MPS) is not detected, even when using `use_gpu=True`. Is there any way to use the MPS GPU? Thanks!

fakerybakery

bug

Tokenizer doesn't respect combined_electra-large's max_length

9

**Describe the bug** When parsing a long text using the latest "combined_electra-large" model, I get the error: ``` Token indices sequence length is longer than the specified maximum sequence length...

rmalouf

bug

How to show progress bar in pipeline? [QUESTION]

4

Hi, I have been using stanza bulkprocess to tokenize and ssplit a rather large text stored in a dataframe. My question is how to show a progress bar when running the...

Hansyvea

enhancement

question
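One generic way to get progress feedback on a large batch job is to split the input into chunks and report after each one. The sketch below is not Stanza API: `chunks`, `process_with_progress`, and `fake_tokenize` are hypothetical names, and `process_batch` stands in for whatever pipeline call is being used. It assumes the callable returns one result per input text.

```python
import sys
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunks(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_with_progress(
    texts: Sequence[T],
    process_batch: Callable[[Sequence[T]], Iterable[R]],
    batch_size: int = 32,
    stream=sys.stderr,
) -> List[R]:
    """Run `process_batch` chunk by chunk and report how many inputs are done."""
    results: List[R] = []
    total = len(texts)
    done = 0
    for batch in chunks(texts, batch_size):
        results.extend(process_batch(batch))
        done += len(batch)
        # Overwrite the same terminal line with the running count.
        stream.write(f"\rprocessed {done}/{total}")
        stream.flush()
    stream.write("\n")
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a real pipeline call: split on whitespace.
    def fake_tokenize(batch):
        return [text.split() for text in batch]

    print(process_with_progress(["a b", "c d e", "f"], fake_tokenize, batch_size=2))
```

Processing in chunks also bounds how much text is handed to the pipeline at once, at the cost of one call per chunk.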

"SpaceAfter=No" not being included in misc field of Word objects

8

**Describe the bug** Tokens without a space after them in the original text do not include that info in the misc field of the Word object or in the conllu...

tomlup

bug

fixed on dev
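For context, CoNLL-U records a missing space after a token as `SpaceAfter=No` in the MISC column. The sketch below shows one way that flag can be derived from character offsets. It is an illustration only, not Stanza's implementation: `space_after_flags` is a hypothetical helper, and treating the final token of the text as `_` is an assumption.

```python
from typing import List, Tuple


def space_after_flags(text: str, spans: List[Tuple[int, int]]) -> List[str]:
    """Return one MISC value per token: "SpaceAfter=No" or "_".

    `spans` holds (start_char, end_char) offsets into `text`, end exclusive.
    """
    misc = []
    for _start, end in spans:
        # Assumption: a token is flagged only when a non-whitespace
        # character follows it directly; a token at the very end of the
        # text gets "_".
        glued = end < len(text) and not text[end].isspace()
        misc.append("SpaceAfter=No" if glued else "_")
    return misc


if __name__ == "__main__":
    text = "Hello, world!"
    spans = [(0, 5), (5, 6), (7, 12), (12, 13)]  # Hello , world !
    for (start, end), flag in zip(spans, space_after_flags(text, spans)):
        print(text[start:end], flag)
```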

Language Tamil - Wrong POS tag for "ஊறு" (VERB instead of ADJ)

20

**Describe the bug** [ஊறு](https://ta.wiktionary.org/wiki/%E0%AE%8A%E0%AE%B1%E0%AF%81) **To Reproduce** Steps to reproduce the behavior: ``` import logging import stanza logging.getLogger('stanza').setLevel(logging.ERROR) # Download and initialize the Tamil model # stanza.download('ta') nlp = stanza.Pipeline(lang='ta') #...

SmartManoj

bug
