Peng Qi
@anantvir feel free to start working on this and create a PR against the `dev` branch! I'm afraid though this is not that straightforward--the MWT model relies on the tokenizer...
@AngledLuffa the first thing to check is probably whether we have (NN, "rose") -> "rose" in the UD training set. If not, adding one to the training data would help...
@tbm very likely that that date is a projected release date for UD v2.8. We will definitely consider training a Swahili model should the data become available as part of...
The pure Python-based part does not yet have coref models/support, but you can always access the coref models of CoreNLP through the Python interface!
It could be that @Aaron-Ge's network environment blocks the CDNs that we use for some of the JavaScript files. One potential solution would be to serve them from our...
@AngledLuffa It could be adapted to allow multiple connections per dependent, but it doesn't support that out of the box.
Not quite streaming, but we recently made a change to allow batched processing that respects Document boundaries (#577)
@johann-petrak I agree -- this is a first step towards streaming, but it is definitely not quite there yet. We were trying to add complexity one step at a time, and...
@korakot I think the best way is to get in touch with the Universal Dependencies community first, to start building that treebank and correcting Thai-PUD. If you could give us...
@AngledLuffa could you also share some examples of what the plaintext file looks like so that native speakers can help us diagnose potential issues?