neotoma
Fix charclass parser to ignore whitespace and quote backslash
Hi, while working on a JavaScript PEG grammar for neotoma, I found that charclasses are not very convenient to use: whitespace inside them is treated as significant and backslashes are not quoted, so I wrote this patch to fix both. PEG ignores whitespace everywhere else, so I thought it was the right thing to do in charclasses too, and it makes a JS grammar with fairly big charclasses more readable. Like this:
Before:
UnicodeLetter <- [\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}];
After:
UnicodeLetter <- [ \p{Lu}
\p{Ll}
\p{Lt}
\p{Lm}
\p{Lo}
\p{Nl} ];
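To make the intended behavior concrete, here is a minimal sketch in Python (not neotoma's Erlang, and not the code from this patch) of how a charclass body could be normalized before being handed to a regex engine. The function name normalize_charclass is hypothetical, and keeping an escaped whitespace character such as "\ " intact is an assumption, since the patch description does not say how that case is handled.

```python
def normalize_charclass(body: str) -> str:
    """Drop unescaped whitespace from the text between '[' and ']'.

    Hypothetical helper for illustration only; it is not part of neotoma.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # Assumption: an escape (backslash plus the next character)
            # is kept intact, so "\ " can still denote a literal space.
            out.append(body[i:i + 2])
            i += 2
        elif ch.isspace():
            # Unescaped spaces, tabs, and newlines are layout only.
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this sketch, a multi-line body like the "After" example above collapses to the same string as the single-line form, minus the doubled backslashes.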
Travis doesn't seem to like R14
Looks like this is due to an outdated Travis config. #35 should fix this.
It would be nice to have the wiki reflect this change too.
@metadave @mkurkov I'm not sure I will accept this change yet. It seems innocuous but might break other users' grammars.
@seancribbs Well, I see, it is not backward compatible. I can of course rewrite the grammar so that charclasses contain no whitespace or newlines, but I think this change is in line with PEG syntax. Maybe we can have it in version 2.0? I see you are working on it in a separate branch.
@mkurkov Right, and sadly that's really far off, given the huge task I have taken on there. In an ideal world, the character class will be compiled out to a case expression rather than using re internally, unless I find that re has better and more predictable performance.