Jonathan Washington comments

Results: 239 comments of Jonathan Washington

Parsing slows down with each subsequent parse

The amount of memory it uses grows as well. On the first few iterations it uses about 55 MB; by 1000 iterations it's up to double that.

First steps towards a CG-based UD parser; point to the lexicon-proofreading-effort in the docs; some corrections in puupankki

> Note that adding the dependency labels / arcs to the output of kaz-tagger and kaz-disam (as currently is the case) breaks translators. Is this something that would benefit from...

First steps towards a CG-based UD parser; point to the lexicon-proofreading-effort in the docs; some corrections in puupankki

@khannatanmai, see above about secondary tags.

two neg.ifi paradigms

> Especially if the transducer's not weighted and it will just take one analysis in a greedy manner and go on with that. I don't think there's ever ambiguity with...

two neg.ifi paradigms

I think your examples are okay, though I'm probably not the person to ask. So what do you propose for the two analyses of a form like "оқыған жок"? And...

transducer no longer meets Apertium Turkic standards

@mansayk, thank you for sharing your view on this—it's very helpful. I'd just like to clarify one point. You say: > Jumping all the time through the file is not...

transducer no longer meets Apertium Turkic standards

Okay, I have a better sense now of what the reasoning is. These are valid reasons, and I've experienced these issues myself. I like Fran's proposal—to keep "open" and "closed"...

transducer no longer meets Apertium Turkic standards

> I would suggest to place LEXICON Open in the very end of the file, so it is easier to find where it ends when we sort it. I'm used...

Redundant and miscategorized stems in apertium-kaz.kaz.lexc

Note, a GCI student wrote a lexc parser and lexicon deduplicator a couple years ago. Let me know if you want help digging it up.

Redundant and miscategorized stems in apertium-kaz.kaz.lexc

Relevant tools: https://github.com/apertium/apertium-on-github/issues/51