choco-solver
Accidental special characters from integers in regex
I'd expect any integer written in the <123> format in a FiniteAutomaton regex to be parsed as that integer. But certain integers are converted to characters that are already defined as regex operators, which makes the FiniteAutomaton constructor misbehave:
- FiniteAutomaton("<43>") hangs (it gets converted to ".")
- FiniteAutomaton("<59>") hangs (it gets converted to "@")
- FiniteAutomaton("<117>") throws an exception (it gets converted to "~")
There are likely more integers that don't hang or throw but still get converted to other operators (like +, *, ?) and silently produce automata with the wrong behavior.
Have you considered putting a backslash in front of the character during the int parsing and conversion step in StringUtils.toCharExp()? For example, "<43>" would be converted to \. instead of a bare ., so it wouldn't trigger operator behavior when the resulting character conflicts with one.
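To make the suggestion concrete, here is a rough, standalone sketch of the escaping step. It is not the actual StringUtils.toCharExp() code: the class name EscapeSketch, the helper escapeIfReserved, and the RESERVED character list are all hypothetical, and the list is only my assumption about which characters the underlying regex syntax treats as operators, so it would need checking against the real parser.

```java
// Hypothetical sketch, not choco-solver code: escape a converted character
// if it collides with a regex operator.
class EscapeSketch {

    // ASSUMPTION: characters treated as operators by the regex parser.
    // Verify this list against the actual regex syntax before relying on it.
    private static final String RESERVED = "|&?*+{}~[]^-.#@\"()<>\\";

    // Returns the character as a string, prefixed with a backslash
    // when it appears in the (assumed) reserved set.
    static String escapeIfReserved(char c) {
        return RESERVED.indexOf(c) >= 0 ? "\\" + c : String.valueOf(c);
    }

    public static void main(String[] args) {
        System.out.println(escapeIfReserved('.')); // reserved, so escaped
        System.out.println(escapeIfReserved('a')); // not reserved, unchanged
    }
}
```

The idea would be to call something like this on the character produced for each <n> token before appending it to the expression, so the regex parser sees a literal rather than an operator.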
Your patch seems ok, but once again, it advocates for a deeper refactoring of REGULAR.