pseudomonas

41 comments by pseudomonas

@AngledLuffa Reading the docs at https://stanfordnlp.github.io/stanza/new_language.html it looks like unlabelled text is _only_ good for improving NER/Sentiment/Constituency parsing and not for any of the tasks I'm using (tokenize, lemma, POS,...

I feel like in the long run it would be nice to be able to put a standard-architecture language model in there and have the stanza training script do the...

Well, I could give it a whirl if you can point me at docs on how to do the fine-tuning and plumbing it into the system; this is stuff I...

> do you think I can be more helpful starting with

Combining the treebanks seems like, if it can be done, it will provide benefits; and a BERT can presumably...

@Jemoka I think in terms of improving performance _longer-term_ across Stanza, being able to leverage BERT-integration would be good. I'm probably going to try @AngledLuffa's suggestion https://github.com/stanfordnlp/stanza/issues/1311#issuecomment-1828961531 in any case....

@AngledLuffa if I'm training a model and the training is interrupted, what are the command-line flags for "resume training starting with this saved checkpoint"?

It took my little computer over a day to reproduce the benchmark, so I might try running the BERT one on my work's cluster with GPUs…

Your baseline scores (Model==None) are rather higher than those on https://stanfordnlp.github.io/stanza/performance.html, assuming that POS is XPOS rather than UPOS; that page has UPOS = 92.41; XPOS = 85.13; LAS = 73.97.

I've found a different but related issue with both perseus and proiel parsers, which is that they perform incredibly badly with accents stripped out (they do things like processing definite...
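
For anyone wanting to reproduce the accent-stripped input, here is a minimal sketch of one way to remove diacritics from Greek text using only the Python standard library. The helper name `strip_accents` is my own and is not part of Stanza; it assumes that "accents stripped out" means dropping all combining marks (accents, breathings, iota subscript) after Unicode decomposition.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining diacritics (hypothetical helper, not a Stanza API)."""
    # NFD splits each precomposed letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers accents, breathings, and similar marks.
    bare = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Recompose so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", bare)


if __name__ == "__main__":
    print(strip_accents("τὸν ἄνθρωπον"))
```

The stripped text could then be passed through the same pipeline as the accented original to compare the two analyses side by side.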

I can give it a try. I see that there are [various packages](https://stackoverflow.com/questions/36399381/whats-the-fastest-way-of-checking-if-a-point-is-inside-a-polygon-in-python) with "is point within polygon" functions, so I could probably hack together something using the `.filter`...
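
As a rough illustration of what such an "is point within polygon" check does, here is a dependency-free sketch using the standard ray-casting (even-odd) rule. The function name `point_in_polygon` and the vertex-list representation are assumptions made for this example; the packages in the linked answer expose their own APIs, and points lying exactly on an edge are not handled specially here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    j = len(polygon) - 1  # index of the previous vertex, starting with the last one
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # The edge only matters if it straddles the horizontal line through y.
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
        j = i
    return inside


if __name__ == "__main__":
    square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    print(point_in_polygon(2.0, 2.0, square))
    print(point_in_polygon(5.0, 2.0, square))
```

A predicate like this could be used as the test inside whatever filtering step is chosen, keeping only the items whose coordinates fall inside the polygon.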