Amir Plivatsky comments

Results: 369 comments by Amir Plivatsky

An additional speedup improvement for long sentences - with a catch

> One of numerous discussion threads suggests using two dictionaries: the normal one, and if the parse fails, then a second one with an extra `NULL- & stuff` on every...

An additional speedup improvement for long sentences - with a catch

> How can you limit it to use the minimum null count possible? It seems it is still possible. I will test that.

An additional speedup improvement for long sentences - with a catch

> For long sentences `mk_parse_set()` may take a very long time (even many minutes).
> I think I can fix that, and thought that maybe such a fix can be included...

An additional speedup improvement for long sentences - with a catch

> 3\. For (1), I use something like:
> `sed '/verbosity=/d' data/en/corpus-fix-long.batch | link-parser -v=2`
> For changing parse options I have to add more components to the sed arguments,...
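
For readers who prefer not to rely on `sed`, the filtering step quoted above (deleting every line that contains `verbosity=` before the batch file is piped to the parser) can be sketched in Python. This is only an illustration: it assumes the batch file is plain text with one directive or sentence per line, and the name `drop_matching_lines` is hypothetical, not part of link-grammar.

```python
def drop_matching_lines(lines, needle="verbosity="):
    """Return only the lines that do not contain `needle`.

    Hypothetical helper: mimics deleting matching lines from a batch
    file before feeding it to a parser. Plain substring match, no regex.
    """
    return [line for line in lines if needle not in line]


# Made-up sample content, used only to exercise the helper.
sample = ["!verbosity=1\n", "This is a test.\n", "!batch\n"]
filtered = drop_matching_lines(sample)
```

The filtered lines could then be written to the parser's standard input; adding more parse-option overrides would mean passing further substrings to drop or rewrite.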

An additional speedup improvement for long sentences - with a catch

The documentation of such a feature can be included only in the "debug" documentation. I have already automated everything with scripts, and my benchmark script already contains a similar feature. So...

Idioms

Another idiom-related problem is their dict notation: connectors that start with `ID` are reserved for idioms. This causes a problem for automatic dict generation in which...

Idioms

> The idea was to allow backslash-escaping underbars (in the dict) so they would not be recognized as idiom word delimiters

The dict syntax already contains a way to...

>> double quotes

I started to implement general character quoting using double quotes, but then realized it is not good enough because only full-word double-quoting is allowed. Allowing to double-quote...

Idioms

The benefits of allowing escaping special characters in the dict:

1. It allows defining any string as a word (even `a\.b` when `.b` is not a subscript).
2. If...
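
To make the escaping idea concrete, here is a minimal Python sketch of splitting a dict token at a delimiter only when that delimiter is not backslash-escaped. It is an illustration under stated assumptions, not link-grammar's actual dictionary reader: the function name `split_unescaped` is hypothetical, and it assumes a backslash simply makes the next character literal.

```python
def split_unescaped(token, delim):
    """Split `token` at each `delim` that is not preceded by a backslash.

    Hypothetical helper: a backslash makes the next character literal,
    and the backslash itself is dropped from the result.
    """
    parts, current = [], []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\" and i + 1 < len(token):
            current.append(token[i + 1])  # keep the escaped character as-is
            i += 2
        elif ch == delim:
            parts.append("".join(current))  # unescaped delimiter ends a part
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts
```

Under these assumptions, `split_unescaped(r"a\.b", ".")` keeps `a.b` as a single word, while `split_unescaped("a.b", ".")` separates `a` from a would-be subscript `b`. The same helper with `"_"` as the delimiter shows how an escaped underbar would stay inside one idiom word.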

Strippable affix class regexes

> I am for (5) and otherwise for (2) or (1).

EDIT: Fix the POSIX regex.

I found a better solution, one that all the regex libraries support: instead of lookahead/lookbehind,...