grammarinator
ANTLR's NOT for chars
Changed the following two things for the NOT feature in ANTLR:
- Unescape ' because in ANTLR it has to be escaped (e.g. Comment: ~'\'';). This is done with .replace("\'", "'"). Without this patch, the code uses the second char of the literal, which is the backslash instead of the quote character.
- Added Unicode support. This is done with decode("unicode-escape", "strict"). Without this patch, any Unicode escape also results in a backslash.
Note: Using NOT for a string is still buggy. Only the first char of the string is used, not the whole string. Fixing this would require a bigger change in higher-level parts of the code.
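To illustrate the failure mode described above, here is a minimal Python sketch. It assumes the generator receives the literal text exactly as written in the grammar, quotes included; the helper name first_char_naive is hypothetical and is not part of grammarinator.

```python
def first_char_naive(literal: str) -> str:
    """Return the character right after the opening quote, without unescaping.

    Hypothetical helper: it only mimics the behaviour described in this PR.
    """
    return literal[1]


# The grammar text ~'\'' yields the 4-character literal: quote, backslash, quote, quote.
quote_literal = "'\\''"
print(repr(first_char_naive(quote_literal)))  # the backslash, not the quote

# The same happens for a Unicode escape such as ~'\u0061'.
unicode_literal = "'\\u0061'"
print(repr(first_char_naive(unicode_literal)))  # again the backslash
```

In both cases the excluded character ends up being the backslash, which is why the literal has to be unescaped before its first character is taken.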
Decoding with unicode-escape resolves any escaping problems, so there is no need for e.g. .replace("\'", "'").
This works with all single-char rules like:
- Comment: ~'\'';
- Newline: ~'\n';
- Backslash: ~'\\';
- Unicode1: ~'\u0061';
and with strings longer than 1 char like:
- Unicode2: ~'\u0061bc';
- Unicode2: ~'1\u0061bc';
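The decoding step can be sketched as follows. This is only an illustration under stated assumptions: it targets Python 3, where str has no decode method, so the text is first encoded to ASCII bytes, and the function name unescape_literal is hypothetical rather than taken from grammarinator.

```python
def unescape_literal(literal: str) -> str:
    """Strip the surrounding quotes and resolve backslash escapes.

    Assumption: the literal contains only ASCII characters, so it can be
    encoded to bytes and decoded again with the unicode_escape codec.
    """
    body = literal[1:-1]  # drop the opening and closing quote
    return body.encode("ascii").decode("unicode_escape", "strict")


# Literals as they appear in the grammar rules listed above.
examples = ["'\\''", "'\\n'", "'\\\\'", "'\\u0061'", "'\\u0061bc'", "'1\\u0061bc'"]
for literal in examples:
    print(literal, "->", repr(unescape_literal(literal)))
```

A single decode call covers the quote, newline, backslash, and \u cases, so a separate replace step for the quote is not needed in this sketch.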
Limitations:
- Unfortunately, the ASCII range used is printables only, so chars like \n or \r are never generated. The first char is 0x20, the ASCII space.
- Does not work with chars outside the ASCII range; only ASCII chars written as Unicode escapes will work.
These limitations cannot be resolved by the code lines I changed here. So with this change only ' and ASCII chars encoded as Unicode escapes will work, which was not the case before.
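The first limitation can be pictured with the sketch below. It assumes the candidate pool is the printable ASCII range 0x20 to 0x7E (the upper bound is an assumption; the text above only states the lower one). The names PRINTABLE_ASCII and not_char_candidates are hypothetical, and this is not grammarinator's actual implementation.

```python
from typing import List

# Assumed candidate pool: printable ASCII, starting at the space (0x20).
PRINTABLE_ASCII = [chr(code) for code in range(0x20, 0x7F)]


def not_char_candidates(excluded: str) -> List[str]:
    """Return every pool character except the excluded one."""
    return [char for char in PRINTABLE_ASCII if char != excluded]


candidates = not_char_candidates("'")
print("'" in candidates)    # the excluded quote is gone
print("\n" in candidates)   # control chars were never in the pool
print(repr(candidates[0]))  # the pool starts at the space
```

With such a pool, excluding a character outside printable ASCII removes nothing, and control characters can never be produced, whichever character is negated.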
Thanks for the PR, but the issue was resolved as part of a bigger improvement around escapes (#75).
If you find that the landed PR did not solve all the limitations, please, do open a new issue or PR.