pseudomonas comments

Results 41 comments of pseudomonas

Proiel parser exhibits odd behaviour with respect to punctuation

@AngledLuffa Reading the docs at https://stanfordnlp.github.io/stanza/new_language.html it looks like unlabelled text is _only_ good for improving NER/Sentiment/Constituency parsing and not for any of the tasks I'm using (tokenize, lemma, POS,...

Proiel parser exhibits odd behaviour with respect to punctuation

I feel like in the long run it would be nice to be able to put a standard-architecture language model in there and have the stanza training script do the...

Proiel parser exhibits odd behaviour with respect to punctuation

Well, I could give it a whirl if you can point me at docs on how to do the fine-tuning and plumbing it into the system; this is stuff I...

Proiel parser exhibits odd behaviour with respect to punctuation

> do you think I can be more helpful starting with

Combining the treebanks seems like, if it can be done, it will provide benefits; and a BERT can presumably...

Proiel parser exhibits odd behaviour with respect to punctuation

@Jemoka I think in terms of improving performance _longer-term_ across Stanza, being able to leverage BERT-integration would be good. I'm probably going to try @AngledLuffa's suggestion https://github.com/stanfordnlp/stanza/issues/1311#issuecomment-1828961531 in any case....

Proiel parser exhibits odd behaviour with respect to punctuation

@AngledLuffa if I'm training a model and the training is interrupted, what are the command-line flags for "resume training starting with this saved checkpoint"?

Proiel parser exhibits odd behaviour with respect to punctuation

It took my little computer over a day to reproduce the benchmark, so I might try running the BERT one on my work's cluster with GPUs…

Proiel parser exhibits odd behaviour with respect to punctuation

Your baseline scores (Model==None) are rather higher than those on https://stanfordnlp.github.io/stanza/performance.html , assuming that POS is XPOS rather than UPOS; that page has UPOS = 92.41, XPOS = 85.13, LAS = 73.97.

Proiel parser exhibits odd behaviour with respect to punctuation

I've found a different but related issue with both perseus and proiel parsers, which is that they perform incredibly badly with accents stripped out (they do things like processing definite...
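For readers who want to reproduce accent-stripped input of the kind described above, here is a minimal sketch using only the Python standard library. The function name `strip_accents` is hypothetical, and it is an assumption that the original text was stripped this way; note that this approach also removes breathings and iota subscripts, since all of them are combining marks.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove all combining marks (accents, breathings, etc.) from text."""
    # Decompose each character into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every nonspacing mark (Unicode category "Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever is left into canonical form.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    print(strip_accents("τὸν ἄνθρωπον"))
```

Running a parser on both the original and the stripped string would be one way to compare its behaviour on the two inputs.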

Polygons other than rects for crop (etc)

I can give it a try. I see that there are [various packages](https://stackoverflow.com/questions/36399381/whats-the-fastest-way-of-checking-if-a-point-is-inside-a-polygon-in-python) that provide an "is point within polygon" check, so I could probably hack together something using the `.filter`...
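As an illustration of the "is point within polygon" check mentioned in this comment, here is a dependency-free sketch using the standard ray-casting (even-odd) rule. The name `point_in_polygon` is hypothetical and is not part of any library discussed in the thread; how it would plug into `.filter` is left open, since the comment does not say.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Even-odd ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Only edges that straddle the horizontal line through the point
        # can be crossed by a ray cast to the right.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    triangle = [(0, 0), (4, 0), (0, 4)]
    points = [(1, 1), (3, 3), (0.5, 2)]
    print([p for p in points if point_in_polygon(p, triangle)])
```

Points lying exactly on an edge may fall on either side with this rule, so a real crop would need to decide how to treat boundaries; the packages linked in the comment are likely to be faster for large inputs.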