Amir Plivatsky
@linas, To solve the split problem you pointed out in your comment on MPUNC, I implemented an MPUNC regex mechanism that uses lookahead/lookbehind (directly or indirectly) in a try not...
I said above: > just don't MPUNC-split words that match a regex [...] > I will try to implement that, [...] If a word contains two kinds of punctuation, one...
In addition to fixing `anysplit.c`, I fixed **only** the `amy` dict... I didn't look at the `any` dict. I will try to fix it too. But note that 4 parts...
I took a glance at `any`. Its dict is not designed for morphology at all. So maybe no change is needed, and you need to test the `amy` dict with...
The problem is that the current amy/4.0.affix in the repository is **not** the version that I included in PR #481. If you replace it with the version I provided, then...
> Side question: for Hebrew, if I had to split a word into all of its morphological components, how many pieces might it have (in the common cases)? I get...
I have just said: > The problem is that the current amy/4.0.affix in the repository is not the version that I included in PR #481. > If you replace it...
The actual problem is that splits into 4 or more parts are currently translated to "multi suffix". I.e., `adsfasdfasdfasdf` can be broken as: `adsf= asdfa.= =sdfa =sdf`. But `amy/4.0.dict` doesn't provide a...
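To make the 4-part case concrete, here is a minimal Python sketch that enumerates every way to cut a word into a given number of non-empty pieces and marks them like the example above. It is not how `anysplit.c` works internally. The helper names `splits` and `mark` are hypothetical, and the marking rule (first piece gets a trailing `=`, second piece a trailing `.=`, every later piece a leading `=`) is an assumption inferred only from that single example.

```python
from itertools import combinations
from math import comb


def splits(word, parts):
    """Yield every way to cut `word` into `parts` non-empty contiguous pieces."""
    # Choose parts-1 cut positions strictly inside the word.
    for cuts in combinations(range(1, len(word)), parts - 1):
        bounds = (0, *cuts, len(word))
        yield tuple(word[a:b] for a, b in zip(bounds, bounds[1:]))


def mark(pieces):
    """Mark pieces as prefix, stem, suffixes (assumed convention, see above)."""
    prefix, stem, *suffixes = pieces  # needs at least two pieces
    return " ".join([prefix + "=", stem + ".=", *("=" + s for s in suffixes)])


word = "adsfasdfasdfasdf"
four_part = list(splits(word, 4))

# The number of k-part splits of an n-letter word is C(n-1, k-1).
assert len(four_part) == comb(len(word) - 1, 3)

print(mark(("adsf", "asdfa", "sdfa", "sdf")))
```

Every tuple from `splits(word, 4)` with more than one suffix piece is a "multi suffix" candidate in the sense described above, so the dict has to accept such sequences for those splits to link.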
> For 8-12 word sentences, the result can be one sane morphism in a million, which is far far too many to examine, just to find one that works. It...
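For a rough feel of why the candidate space gets that large, here is a back-of-the-envelope Python sketch. It assumes every word is split independently into 1 up to `max_parts` non-empty pieces and that all such splits are allowed; the real splitter and dict restrict this further, so treat the numbers as an upper bound. The function names are hypothetical.

```python
from math import comb, prod


def split_count(length, max_parts):
    """Ways to cut a word of `length` letters into 1..max_parts non-empty pieces."""
    return sum(comb(length - 1, k - 1) for k in range(1, max_parts + 1))


def sentence_split_count(words, max_parts):
    """Combined count, assuming each word is split independently of the others."""
    return prod(split_count(len(w), max_parts) for w in words)


sentence = "this is a test sentence of eight words".split()
print(sentence_split_count(sentence, 3))
```

Since the per-word counts multiply, even a modest number of splits per word grows very quickly with sentence length, which is why sampling is needed rather than exhaustive enumeration.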
With more than 3 parts you need the dict change mentioned above... With a >20 word it looks fine to me. To see the sampling, use the following: `link-parser amy -m...