grammarinator ANTLR's NOT for chars

This PR changes two things in how ANTLR's NOT feature is handled:

Unescape ' because in ANTLR it has to be escaped (e.g. Comment: ~'\'';). This is done with .replace("\'", "'"). Without this patch, the code uses the second char of the literal, which is the backslash instead of the quote character.
Added Unicode support. This is done with decode("unicode-escape", "strict"). Without this patch, any Unicode escape also resolves to a backslash.

Note: Using NOT on a string is still buggy. Only the first char of the string is used, not the whole string. Fixing this would require a bigger change higher up in the code.

Mar 24 '22 17:03 38b394ce01

Decoding with unicode-escape resolves all escaping problems, so there is no need for e.g. .replace("\'", "'"). This works with all single-char rules such as Comment: ~'\''; or Newline: ~'\n'; or Backslash: ~'\\'; or Unicode1: ~'\u0061'; and with strings longer than 1 char such as Unicode2: ~'\u0061bc'; or Unicode2: ~'1\u0061bc';.
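A minimal sketch of the decoding step described above, for illustration only. The helper name unescape_literal is hypothetical and not part of grammarinator; it assumes the literal is passed with its surrounding quotes and that the grammar source itself is ASCII-only.

```python
def unescape_literal(literal: str) -> str:
    """Resolve the escapes inside a quoted ANTLR literal.

    Hypothetical helper for illustration; assumes ASCII-only source text.
    """
    body = literal[1:-1]  # strip the surrounding single quotes
    # Python's unicode_escape codec understands \', \n, \\ and \uXXXX,
    # so no separate .replace() for the quote is needed.
    return body.encode("ascii").decode("unicode_escape")


# The literals from the rules listed above, written as raw strings.
examples = [r"'\''", r"'\n'", r"'\\'", r"'\u0061'", r"'\u0061bc'", r"'1\u0061bc'"]
decoded = [unescape_literal(e) for e in examples]
```

Each entry in decoded is the unescaped text of the corresponding literal, so the first char of a single-char literal is the intended character and not the backslash.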

Limitations:

Unfortunately, the ASCII range used covers printable chars only, so chars like \n or \r are never generated. The first char is 0x20, the ASCII space.
It does not work with chars outside the ASCII range; only ASCII chars encoded as Unicode escapes will work.

None of these three limitations can be resolved by the lines of code I changed here. With this change, only ' and ASCII chars encoded as Unicode escapes will work, which was not the case before.
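To make the range limitations concrete, here is a rough sketch of picking a char for a negated literal from a printable-only alphabet. The name not_char_candidates and the upper bound 0x7E are assumptions for illustration (the thread only states that the range starts at 0x20); this is not grammarinator's actual code.

```python
# Assumed alphabet: printable ASCII from space (0x20) up to '~' (0x7E).
PRINTABLE_ASCII = [chr(cp) for cp in range(0x20, 0x7F)]


def not_char_candidates(excluded: str) -> list:
    """Return the chars that may be generated for ~'excluded'.

    Hypothetical helper: takes the complement within the assumed alphabet.
    """
    return [c for c in PRINTABLE_ASCII if c not in excluded]


after_not_a = not_char_candidates("a")          # 'a' is removed from the pool
after_not_newline = not_char_candidates("\n")   # nothing to remove: \n was never in the pool
after_not_e_acute = not_char_candidates("\u00e9")  # non-ASCII char: excluding it has no effect
```

With such an alphabet, ~'\n' and a negated non-ASCII char both yield the full printable set, and control chars like \n or \r can never be produced no matter which char is negated.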

Mar 25 '22 09:03 38b394ce01

Thanks for the PR, but the issue was resolved as part of a bigger improvement around escapes (#75).

If you find that the landed PR did not solve all the limitations, please do open a new issue or PR.

Mar 08 '23 09:03 renatahodovan
