Peng Qi
@anantvir feel free to start working on this and create a PR against the `dev` branch! I'm afraid though this is not that straightforward--the MWT model relies on the tokenizer...
@AngledLuffa the first thing to check is probably whether we have (NN, "rose") -> "rose" in the UD training set. If not, adding one to the training data would help...
@tbm very likely that that date is a projected release date for UD v2.8. We will definitely consider training a Swahili model should the data become available as part of...
The pure Python-based part does not yet have coref models/support, but you can always access the coref models of CoreNLP through the Python interface!
It could be that @Aaron-Ge's network environment blocks the CDNs that we use for some of the JavaScript files. One potential solution would be to serve them from our...
@AngledLuffa It could be adapted to allow multiple connections per dependent, but it doesn't support that out of the box.
Not quite streaming, but we recently made a change to allow batched processing that respects Document boundaries (#577)
@johann-petrak I agree -- this is a first step towards streaming, but it is definitely not quite there yet. We were trying to add complexity one step at a time, and...
@korakot I think the best way is to get in touch with the Universal Dependencies community first, to start building that treebank and correcting Thai-PUD. If you could give us...
@AngledLuffa could you also share some examples of what the plaintext file looks like so that native speakers can help us diagnose potential issues?