Accidental special characters from integers in regex

Open aengelberg opened this issue 10 years ago • 1 comments

I'd expect any integer written in the <123> format in a FiniteAutomaton regex to be parsed as that integer. But certain integers are converted to characters that conflict with already-defined operators, which causes the FiniteAutomaton constructor to misbehave:

  • FiniteAutomaton("<43>") hangs (it gets converted to ".")
  • FiniteAutomaton("<59>") hangs (it gets converted to "@")
  • FiniteAutomaton("<117>") throws an exception (it gets converted to "~")

There are likely more integers that aren't hanging or throwing, but are getting converted to other operators (like +, *, ?) and producing automata with the wrong behavior.

Have you considered putting a backslash in front of the character during the int parsing and conversion step in StringUtils.toCharExp()? e.g. "<43>" would be converted to \. and wouldn't trigger the behavior of an operator if it conflicts with one.
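
A rough sketch of that escaping idea, not the actual choco-solver code: the class `EscapeSketch`, its `toCharExp` method, the `OPERATORS` set, and the `intToChar` parameter are all hypothetical names introduced here. The operator set is an assumption about which characters the underlying regex syntax treats specially, and the mapping used in `main` is a plain cast, which is not the int-to-char mapping FiniteAutomaton really uses.

```java
import java.util.function.IntFunction;

public class EscapeSketch {
    // Assumption: characters that carry operator meaning in the regex syntax.
    private static final String OPERATORS = "|&?*+{}~[].#@\"()<>\\";

    // Replaces every <digits> token with its mapped character, prefixing a
    // backslash when that character would otherwise be read as an operator.
    // All other input is copied through unchanged.
    static String toCharExp(String regex, IntFunction<Character> intToChar) {
        StringBuilder out = new StringBuilder(regex.length());
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            int close = (c == '<') ? regex.indexOf('>', i + 1) : -1;
            if (close > i + 1 && isAllDigits(regex, i + 1, close)) {
                int value = Integer.parseInt(regex.substring(i + 1, close));
                char mapped = intToChar.apply(value);
                if (OPERATORS.indexOf(mapped) >= 0) {
                    out.append('\\'); // escape so the operator is taken literally
                }
                out.append(mapped);
                i = close + 1;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // True when every character in [from, to) is a decimal digit.
    private static boolean isAllDigits(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (!Character.isDigit(s.charAt(k))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Placeholder mapping (plain cast), NOT the library's real mapping.
        IntFunction<Character> placeholder = n -> (char) n;
        System.out.println(toCharExp("<43>", placeholder));
    }
}
```

With the placeholder cast, 43 maps to `+`, so the token comes out as `\+`. With the library's real mapping the same check would escape whichever operator character the integer lands on.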

aengelberg avatar Oct 12 '15 02:10 aengelberg

Your patch seems ok, but once again, it advocates for a deeper refactoring of REGULAR.

cprudhom avatar Oct 14 '15 09:10 cprudhom