John Bauer

Results: 1064 comments by John Bauer

Well, the good news is, it is definitely possible to make it stable. The bad news is, this requires either NestedTensor or PackedSequence. Unfortunately, the LSTM we use isn't compatible...

Conferred with my PI (@manning) and we were thinking, maybe just make PackedSequence the default behavior so that the results are always repeatable. Just take the hit on the efficiency......

Unfortunately, NestedTensor doesn't support these operations, at least as of Torch 2.4. Perhaps one day.

Version 1.11.0 uses PackedSequence, as explained above. This makes it a little slower but avoids this bug.
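For readers who haven't used PackedSequence, here is a minimal sketch of running a padded batch through an LSTM with it. This is not the library's actual code: the layer sizes, sequence lengths, and variable names are all made up for illustration, and it only uses the standard `torch.nn.utils.rnn` helpers.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
lstm = nn.LSTM(input_size=4, hidden_size=3, batch_first=True)

# Two sequences with true lengths 5 and 2, padded to length 5.
lengths = torch.tensor([5, 2])
batch = torch.randn(2, 5, 4)
batch[1, 2:] = 0.0  # padding positions of the shorter sequence

# Packing records where each sequence really ends, so the padding
# positions are never fed through the recurrence.
packed = pack_padded_sequence(
    batch, lengths, batch_first=True, enforce_sorted=False
)
packed_out, (h_n, c_n) = lstm(packed)

# Unpack to a padded tensor again; positions past each length are zero-filled.
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

print(output.shape)   # (batch, max_len, hidden_size)
print(out_lengths)    # the original lengths, in the original order
```

The final hidden state for the short sequence is taken at its last real token rather than at a padding position, which is the property that matters here.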

Ah, I figured it out. When you create the document via `CoNLL.conll2doc`, it creates sentences and words from the conll, but doesn't stitch together the entire document text into a...

Should be fixed in the `multidoc_tokenize` branch. If that's no longer there by the time you get this message, it's because I merged it after the unit tests ran. I'll...

When I look for that word in the training data, it is labeled `Gender=Masc` in both of the bigger Romanian treebanks:

```
[john@localhost UD_Romanian-RRT]$ grep Sistemul *conllu | grep -v...
```
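The same kind of check can be sketched in Python for anyone who wants counts rather than raw grep output. This is only an illustration: `gender_counts` is a hypothetical helper, and the sample row is made up to show the ten-column CoNLL-U layout, not copied from the treebank.

```python
from collections import Counter


def gender_counts(conllu_lines, form):
    """Count the Gender feature values seen for a given word form."""
    counts = Counter()
    for line in conllu_lines:
        line = line.rstrip("\n")
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # CoNLL-U rows have 10 columns; FORM is column 2, FEATS is column 6.
        if len(cols) != 10 or cols[1] != form:
            continue
        gender = "(none)"
        if cols[5] != "_":
            for feat in cols[5].split("|"):
                name, _, value = feat.partition("=")
                if name == "Gender":
                    gender = value
        counts[gender] += 1
    return counts


# Made-up row for illustration only, with simplified features.
sample = [
    "1\tSistemul\tsistem\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_",
]
print(gender_counts(sample, "Sistemul"))
```

On a real treebank you would pass an open file handle instead of `sample`, since iterating a file yields its lines.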

Is the conclusion that there's nothing to be done? It would basically require an overhaul of the dataset or a special case of some kind for Romanian.

My PI points out that COLLINS.pm is part of evalb and
- collapses PRT into ADVP
- deletes all punct nodes

But the version of evalb labeled "the latest" has...
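To make the two normalizations concrete, here is a rough sketch of their effect on a toy tree. It is not how evalb implements them: the tuple-based tree representation, the `normalize` helper, and the exact set of punctuation tags are all assumptions for illustration.

```python
# Internal nodes are (label, [children]); leaves are (tag, word).
PUNCT_TAGS = {",", ".", ":", "``", "''"}  # assumed set, for illustration
LABEL_MAP = {"PRT": "ADVP"}               # collapse PRT into ADVP


def normalize(tree):
    """Return a copy with punctuation leaves removed and PRT relabeled."""
    label, rest = tree
    if isinstance(rest, str):
        # Leaf: drop it if its tag is punctuation.
        return None if label in PUNCT_TAGS else (label, rest)
    children = [c for c in (normalize(child) for child in rest) if c is not None]
    if not children:
        # A constituent left with no children disappears as well.
        return None
    return (LABEL_MAP.get(label, label), children)


tree = ("S", [
    ("NP", [("PRP", "He")]),
    ("VP", [("VBD", "gave"), ("PRT", [("RP", "up")])]),
    (".", "."),
])
print(normalize(tree))
```

After normalization the `PRT` bracket is scored as `ADVP` and the final period no longer appears in the tree, so two parsers that differ only on those points would get the same score.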