context-free sequence containing a nullable fails on empty
Describe the bug
Noticed and isolated by @rodinaarssen
rascal>syntax X = ("(" () ")");
ok
rascal>[X] "()"
|TODO:///|: ParseError(|prompt:///|(0,0,<1,0>,<1,0>))
rascal>
The issue goes away if we remove the (...) sequence wrapper. It persists if the () empty nonterminal is replaced by any other nullable nonterminal.
Wrote a failing "Sequence4" test for this: https://github.com/usethesource/rascal/pull/2214
If we add spaces the issue goes away. So it seems this has to do with sequence members being empty:
rascal>syntax X = ("(" () ")");
ok
rascal>layout L = [\ ]*;
ok
rascal>[X] "( )"
X: (X) `( )`
rascal>[X] "(  )"
X: (X) `(  )`
rascal>[X] "()"
|TODO:///|: ParseError(|prompt:///|(0,0,<1,0>,<1,0>))
In this case we have 5 sequence members:
- "("
- layout
- ()
- layout
- ")"
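How the three written members become five can be sketched as follows. This is only an illustration of the interleaving idea, not Rascal's actual grammar normalization code; the class `LayoutInterleaving` and the method `interleave` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the Rascal code base.
final class LayoutInterleaving {
    // Inserts the layout symbol between every pair of adjacent sequence members.
    static List<String> interleave(List<String> members, String layout) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < members.size(); i++) {
            if (i > 0) {
                result.add(layout); // layout goes between members, never before the first
            }
            result.add(members.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // The three members of ("(" () ")") as written in the grammar.
        List<String> members = List.of("\"(\"", "()", "\")\"");
        System.out.println(interleave(members, "layout"));
    }
}
```

For the three members written in the grammar this yields the five-element list above, with a layout position on each side of the empty () member.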
With only one space given, the parse is ambiguous as it should be:
rascal>/amb(_) := [X] "( )"
bool: true
With two spaces it's the same. Only when no spaces are given, the parse fails erroneously.
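The expected behaviour can be modelled with a small counting sketch. It assumes that each of the two layout positions independently matches `[\ ]*`, that no follow restriction applies, and that () consumes nothing; the class `LayoutSplits` is a hypothetical model, not the parser itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how n spaces can be divided over the two layout positions.
final class LayoutSplits {
    // Each entry is {spaces before (), spaces after ()}.
    static List<int[]> splits(int spaces) {
        List<int[]> result = new ArrayList<>();
        for (int before = 0; before <= spaces; before++) {
            result.add(new int[] { before, spaces - before });
        }
        return result;
    }

    // More than one split means more than one derivation, i.e. an ambiguity.
    static boolean isAmbiguous(int spaces) {
        return splits(spaces).size() > 1;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 2; n++) {
            System.out.println(n + " space(s): " + splits(n).size() + " split(s)");
        }
    }
}
```

Under these assumptions one space gives two splits and two spaces give three, which matches the observed ambiguity. Zero spaces gives exactly one split, so "()" should parse unambiguously rather than fail.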
No follow restrictions or other filters are in play here. I have removed the generated syntax for meta holes from the Sequence4 test, for simplicity's sake.
This one is pretty easy to explain. Empty sequences aren't currently supported.
See: https://github.com/usethesource/rascal/blob/d4cb8063195343cdb3de0039067e0f886e9cc9c9/src/org/rascalmpl/parser/gtd/stack/SequenceStackNode.java#L82
Adding support may just be as easy as changing the return value of the canBeEmpty method to true in case the children array has zero length 😉. The resulting tree should then contain an epsilon, like with optionals and star lists.
The same thing goes for alternatives, in case you'd want to update its semantics as well: https://github.com/usethesource/rascal/blob/d4cb8063195343cdb3de0039067e0f886e9cc9c9/src/org/rascalmpl/parser/gtd/stack/AlternativeStackNode.java#L84
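For the sequence case, the suggested change might look roughly like the sketch below. `SequenceNodeSketch` is a simplified, hypothetical stand-in and not the real SequenceStackNode; only the method name canBeEmpty and the idea of checking the length of the children array come from the comment above.

```java
// Hypothetical, stripped-down stand-in for a sequence stack node.
final class SequenceNodeSketch {
    private final String[] children;

    SequenceNodeSketch(String... children) {
        this.children = children;
    }

    // Suggested semantics: a sequence with no children can match the empty string.
    boolean canBeEmpty() {
        return children.length == 0;
    }

    public static void main(String[] args) {
        System.out.println(new SequenceNodeSketch().canBeEmpty());
        System.out.println(new SequenceNodeSketch("(", ")").canBeEmpty());
    }
}
```

This only covers the nullability check. Producing an epsilon in the resulting tree, as described above, would need additional changes in the real parser that are not shown here.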