Amir Plivatsky
The number of machine instructions in this function is about half after this change, so I guess (I have not validated it) that no overhead was added (I expect that the (unsigned char)...
(Continuing the discussion from issue #42.) >Hebrew: "LPUNC* (WORD=+ | WORD=* WORD)? RPUNC*" First I need to make clear what these `=` marks mean there. Originally (in...
>But it occurred to me (and I also posted on that in detail) that even these terms are not needed, if we just mark the possible internal links somehow, because...
> This suggests that a more holistic approach is needed to graph rewriting: it must somehow be performed "during" parsing, so that parsing can both guide the...
Say we have a perfectly correct sentence. Suppose that the parse explores 10M ranges, and 99% of them are unparsable. So we have about 10M unparsable ranges, and in...
I implemented an affix-class tokens dict check, and I get the following: ``` text link-grammar: Error: afdict_init: class { in file LPUNC: Token "en/4.0.affix" is not in the dictionary link-grammar:...
**Question:** What should be done about token errors in the affix file, such as a token that is not in the dictionary? Possibilities: 1. Issue a warning, but otherwise do nothing. (It will be get...
Since I added regex support to affix stripping (an upcoming PR), there is not much need to check the any/amy affixes. I just replaced them with something like: `… .... "/[[:punct:]]$/": RPUNC+;` (I.e....
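As a rough illustration of what a pattern like `/[[:punct:]]$/` does when used for right-punctuation stripping, here is a minimal Python sketch. It is not link-grammar's implementation: the function name `strip_right_punct` is hypothetical, and because Python's `re` module has no POSIX character classes, `[[:punct:]]` is approximated with the ASCII characters in `string.punctuation`.

```python
import re
import string

# Assumption: approximate POSIX [[:punct:]] with ASCII punctuation,
# since Python's `re` does not support POSIX character classes.
RPUNC_RE = re.compile("[" + re.escape(string.punctuation) + "]$")


def strip_right_punct(word):
    """Peel trailing punctuation off `word`, one character at a time.

    Returns (remaining_word, stripped_tokens), with the stripped tokens
    in their original left-to-right order. At least one character of
    the word is always kept.
    """
    stripped = []
    while len(word) > 1:
        match = RPUNC_RE.search(word)
        if match is None:
            break
        # The match is the last character; record it and cut it off.
        stripped.insert(0, match.group())
        word = word[:match.start()]
    return word, stripped


print(strip_right_punct("word.)"))  # ('word', ['.', ')'])
```

The pattern is anchored with `$`, so punctuation inside a word (as in `a.b`) is left alone; only the right edge is peeled.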
It seems QUOTES and BULLETS are only used in `is_capitalizable()`, so they don't interfere with strippable affix classes. BTW: 1. What about `''.x` (2 single quotes)? Was it intended to...
> 1. What about ''.x (2 single quotes)? Was it intended to be in the dict as a synonym for a double quote? You already answered it above, sorry...