Unexpected exception encountered. Regex is "\/\*[{white}|.]*\*\/"

Open yaoyx108 opened this issue 2 years ago • 3 comments

Hi, as part of school work I'm using jflex to generate a scanner.

I'm trying to write a regex to handle /* comments */.

This is the error I'm seeing.

gen-scanner:
     [java] Reading "src/Scanner/minijava.jflex"
     [java]
     [java] Unexpected exception encountered. This indicates a bug in JFlex.
     [java] Please consider filing an issue at http://github.com/jflex-de/jflex/issues/new
     [java]
     [java]
     [java] Not normalised type = BAR
     [java] child 1 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [10][13] }
     [java] child 2 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [9][' '] }
     [java] jflex.exceptions.CharClassException: Not normalised type = BAR
     [java] child 1 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [10][13] }
     [java] child 2 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [9][' '] }
     [java]     at jflex.core.RegExp.checkPrimClass(RegExp.java:242)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:323)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:307)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExps.normalise(RegExps.java:293)
     [java]     at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action_part00000000(LexParse.java:1029)
     [java]     at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action(LexParse.java:2257)
     [java]     at jflex.core.LexParse.do_action(LexParse.java:598)
     [java]     at java_cup.runtime.lr_parser.parse(lr_parser.java:699)
     [java]     at jflex.generator.LexGenerator.generate(LexGenerator.java:74)
     [java]     at jflex.Main.generate(Main.java:320)
     [java]     at jflex.Main.main(Main.java:336)

BUILD FAILED
/Users/yaoyx/cse/compiler/project/csep501-23au-ao/build.xml:56: Java returned: 1

This is the regex I'm using

eol = [\r\n]
white = {eol}|[ \t]
\/\*[{white}|.]*\*\/ { /* ignore slash-star comments */ }

yaoyx108, Oct 19 '23 08:10

Attached is the full jflex file.

yaoyx108, Oct 19 '23 08:10

minijava.jflex.txt

GitHub only accepts files with certain extensions, so I've added .txt.

yaoyx108, Oct 19 '23 08:10

This does indeed look like a bug, thanks for reporting it. But it's mostly a bug in error reporting (sorry :-)). Allowing macros inside character classes like [..] seems to have uncovered a whole bunch of combinations that aren't properly rejected.

In this case, {white} is a full regular expression (.. | ..), not itself a character class, so it can't be used inside the [..].

To fix your specific problem, it looks like you want just ({white}|.) instead of [{white}|.]. That said, this is equivalent to just [^] (any character, including newline).

At the risk of doing your homework for you, there is a pitfall with this: "/*" [^]* "*/" will not quite match a single slash-star comment. E.g. it will match all of /* abc */ return x; /* abc */ in one go, because JFlex will always give you the longest possible match.
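The pitfall can be reproduced outside JFlex. The sketch below uses java.util.regex, which is a backtracking engine and not JFlex's DFA matcher, so treat it only as an analogue: the greedy pattern /\*[\s\S]*\*/ stands in for "/*" [^]* "*/", and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyCommentDemo {
    // Assumed analogue of the JFlex expression "/*" [^]* "*/".
    // [\s\S] matches any character, including newlines; * is greedy.
    static final Pattern GREEDY = Pattern.compile("/\\*[\\s\\S]*\\*/");

    // Returns the first match of GREEDY in input, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = GREEDY.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String src = "/* abc */ return x; /* abc */";
        // The match runs from the first "/*" to the last "*/",
        // swallowing the code between the two comments.
        System.out.println(firstMatch(src));
    }
}
```

Here the greedy quantifier behaves like the longest-match rule: the code between the two comments ends up inside the match.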

The expression you'd need is "/*", followed by any string that does not contain "*/", followed by "*/". There is a special operator for this in JFlex (not present in usual regexp engines). You can write: "/*" ~"*/"
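To make the intent of "/*" ~"*/" concrete, here is a hand-written Java sketch of the same idea: consume "/*", then everything up to and including the first "*/". This is not how JFlex implements the operator; the class and method names are hypothetical and the code only illustrates which text is meant to be matched.

```java
class SkipToDemo {
    // Returns the index just past the comment that starts at `start`,
    // or -1 if no "/*" starts there or the comment is never closed.
    static int commentEnd(String input, int start) {
        if (!input.startsWith("/*", start)) {
            return -1;
        }
        // Search for the first "*/" after the opening "/*", so the '*'
        // of the opener cannot double as the '*' of the closer.
        int close = input.indexOf("*/", start + 2);
        return close < 0 ? -1 : close + 2;
    }

    public static void main(String[] args) {
        String src = "/* abc */ return x; /* abc */";
        int end = commentEnd(src, 0);
        // Only the first comment is consumed; the code after it is left alone.
        System.out.println(src.substring(0, end));
    }
}
```

Unlike the greedy version, this stops at the first closing delimiter, so return x; stays outside the comment.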

lsf37, Oct 19 '23 22:10