spacy-stanza issues

Offset misalignment in NER using the Stanza tokenizer for French

5

Hi everyone, I just found a problem when trying to analyze a French sentence. When I run the following code: ```python snlp = stanza.Pipeline(lang="fr", verbose=False) stanzanlp = StanzaLanguage(snlp) text =...

vitojph

Add stanza constituency output

2

Since [release v1.3.0](https://github.com/stanfordnlp/stanza/releases/tag/v1.3.0), stanza has a constituency parser for English. Support for more languages will follow. It would be great if we could access the constituency parse from within the...

BramVanroy

Multi-process doesn't work

9

spaCy version: 2.2.4 spacy-stanza version: 0.2.1 stanza version: 1.0.1 It is not possible to use multiple processes in the pipeline while using the Russian model. ```import stanza import spacy from...

LasershowJack

User Warnings make parsing Late

3

I am parsing a big corpus that takes days to index. It is an Arabic corpus, so I need `spacy-stanza`. I have noticed that it is printing for each sentence...

chaouiy
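
One possible workaround for per-sentence warnings like those described above, offered as a sketch and not something the thread confirms, is to filter `UserWarning` with Python's standard `warnings` module around the parsing loop. `parse_sentence` and `parse_corpus_quietly` are hypothetical stand-ins (not part of spacy-stanza) for whatever pipeline call emits the warning.

```python
import warnings


def parse_sentence(text):
    # Hypothetical stand-in for a pipeline call that warns on every sentence.
    warnings.warn("example warning emitted once per sentence", UserWarning)
    return text.split()


def parse_corpus_quietly(sentences):
    # Ignore UserWarning only inside this block; the previous warning
    # filters are restored automatically when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=UserWarning)
        return [parse_sentence(s) for s in sentences]


tokens = parse_corpus_quietly(["a b", "c d e"])
```

Scoping the filter with `catch_warnings()` keeps warnings visible elsewhere in the program. Whether the warnings are actually what slows down parsing would still need to be measured.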

Multi-word token expansion issue, misaligned tokens --> failed NER (German)

4

Hi, thanks for the great project! It seems like stanza performs some pre-processing on the text, which results in misalignments and failed NER. ```UserWarning: Can't set named entities because of...

flipz357

SPACE is not UPOS

4

Hey, first of all, thanks for the great job! I am currently using stanza via spaCy for a small annotation projection project. However, while integrating, I realized that spacy-stanza uses...

bitPogo

Morphological features are lost in Russian model

3

spaCy version: 2.1.9
spacy-stanza version: 0.2.1

```python
import stanza
from spacy_stanza import StanzaLanguage

stanza.download('ru')
snlp = stanza.Pipeline(lang="ru")
nlp = StanzaLanguage(snlp)
text = "Мама мыла раму"
```

Using stanza, I get...

SergeyShk

Stanza's sentencizer only works when `processors = 'tokenize,pos,lemma,depparse'`

1

Hi all, I started an NLP project where I needed high-accuracy sentence segmentation, and therefore decided to use stanza. I was thrilled to find this library, since spaCy is...

namiyousef

Speed

1

spacy-stanza is much slower than Stanza alone.

hg2051

Takes too long to parse doc results

4

Hello, it takes too long to parse the doc object, i.e., to iterate over sentences and the tokens in them. Is that expected? ``` snlp = stanfordnlp.Pipeline(processors='tokenize,pos', models_dir=model_dir) nlp =...

Joselinejamy
