Grammar-Kit
Highlight escape sequences in token strings
I'm new to Grammar-Kit and am trying to write a grammar for BSD-style Makefiles. To get used to Grammar-Kit, I started with this grammar:
{
  parserClass="org.pkgsrc.intellij.mk.parser.BsdMakefileParser"
  extends="com.intellij.extapi.psi.ASTWrapperPsiElement"
  psiClassPrefix="BsdMk"
  psiImplClassSuffix="Impl"
  psiPackage="org.pkgsrc.intellij.mk.psi"
  psiImplPackage="org.pkgsrc.intellij.mk.psi.impl"
  elementTypeHolderClass="org.pkgsrc.intellij.mk.psi.BsdMakefileTypes"
  elementTypeClass="org.pkgsrc.intellij.mk.psi.BsdMakefileElementType"
  tokenTypeClass="org.pkgsrc.intellij.mk.psi.BsdMakefileTokenType"
  tokens=[
    T_NL = "\n"
    T_COMMENT = 'regexp:[ \t\w]+'
  ]
}

file ::= line*

line ::=
    empty_line
  | comment_line

empty_line ::=
  T_NL

comment_line ::=
  comment T_NL

comment ::= "#" T_COMMENT
To my surprise, the T_NL token definition doesn't match a newline.
From the other grammar examples I looked at, I concluded that token values are ordinary string literals with the usual escape sequences. I saw regular expressions with the typical double backslashes, so I assumed that single backslashes would either work as they do in Java or Kotlin, or produce a visible syntax error in the BNF editor (such as "unknown escape sequence").
When I replaced the plain "\n" with "regexp:\n", it worked. I had expected more help from the BNF editor here: highlighting of the regexp: prefix (I had tried regex: first) and Language Injection for the regular expression.
I also wonder why I can simply write \w instead of \\w in the regular expression for T_COMMENT. Judging from the Grammar.bnf file, I had not expected this.
GK passes regexp fragments through AS IS, which saves you from writing \\\\ here.
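The difference between the two token forms can be illustrated with a small Java sketch. It rests on an assumption: that "AS IS" means the characters written in the token string (a backslash followed by a letter) reach the matcher unchanged. The class and method names are hypothetical, and the sketch uses java.util.regex, not the lexer that Grammar-Kit actually generates. Under that assumption, the regex engine resolves \n and \w itself, while a plain literal comparison sees only a backslash followed by n.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // In Java source, "\\n" is the two-character text: backslash, 'n'.
    // Assumption: this is the raw text a BNF token string hands over "as is".
    static final String RAW_NL = "\\n";
    // Raw text of the T_COMMENT fragment from the grammar: [ \t\w]+
    static final String RAW_WORD = "[ \\t\\w]+";

    // Treats the raw text as a regular expression;
    // the regex engine resolves the escape sequences itself.
    static boolean matchesAsRegex(String rawText, String input) {
        return Pattern.matches(rawText, input);
    }

    // Treats the raw text as a plain literal; no escape processing happens.
    static boolean matchesAsLiteral(String rawText, String input) {
        return rawText.equals(input);
    }

    public static void main(String[] args) {
        String newline = "\n"; // a real line feed, resolved by the Java compiler

        System.out.println(matchesAsRegex(RAW_NL, newline));       // prints "true"
        System.out.println(matchesAsLiteral(RAW_NL, newline));     // prints "false"
        System.out.println(matchesAsRegex(RAW_WORD, "a comment")); // prints "true"
    }
}
```

Read this way, a single backslash is enough in a regexp: fragment because only one layer (the regex engine) interprets it, whereas a Java string literal holding the same regex needs the backslash doubled for the compiler first.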