nanopass-framework-racket
Wrong `begin` form.
This may not be solvable (or may require backtracking when constructing the language). But I'll put this here anyway, and if it's not doable, we should output a better error message.
Let's say I have the following language:
#lang nanopass
(define-language Lsrc
  (terminals
    (symbol (s))
    (number (n)))
  (stmt (stmt)
    n
    expr
    (begin stmt ...))
  (expr (expr)
    s
    (begin expr ...)))
I cannot construct the following form:
(with-output-language (Lsrc stmt)
  `(begin a 5))
This fails because it tries to use expr's begin rather than stmt's (which is understandable, because an expr is a valid stmt).
So I think we either need to improve the pattern matcher so that it at least tries to pick the right production (if one exists), or, when I give an ambiguous grammar, report that in the error message.
Hey Leif,
So there are actually two separate problems here.
The first is already documented (and there is already an issue, though I don't think it is spelled out very well in Issue 15). Basically, forms with the same shape in the same production cannot be parsed correctly, because we make the decision based on the keyword and the shape, not on the types in the language. There are basically two things we can do about this:
- We can identify when this happens and error out. This is kind of an unsatisfying solution, but is easier to implement.
- We can make the parser and meta-parser smarter so that it can deal with this, when the types specify a language that is not ambiguous.
I'd rather do the latter, but I am not sure yet exactly how this will work in the meta-parser. (The parser is pretty easy to fix.)
However, the language you have here has a separate problem: it is actually an ambiguous language. For instance, if you have the program:
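To make the first option concrete, here is a minimal sketch in plain Racket. It does not use the framework's internals: the grammar is modeled as quoted data, and `shape`, `reachable-forms`, `shape-clashes`, and `check-unambiguous!` are hypothetical names made up for illustration. It only shows the idea of collecting the keyword forms reachable from a nonterminal and reporting any two that share a keyword and shape.

```racket
#lang racket/base

;; Toy model of Lsrc as plain data (NOT the framework's real representation).
;; Each entry is (nonterminal production ...); a bare symbol refers to a
;; terminal or nonterminal, a list is a keyword form.
(define lsrc
  '((stmt n expr (begin stmt ...))
    (expr s (begin expr ...))))

;; The shape of a keyword form keeps the keyword and the ellipses but
;; replaces every field with _, i.e. it ignores the field types.
(define (shape form)
  (cons (car form)
        (for/list ([x (in-list (cdr form))])
          (if (eq? x '...) '... '_))))

;; Keyword forms accepted where `nt` is expected, each paired with the
;; nonterminal that defines it. Follows bare references such as `expr`.
(define (reachable-forms grammar nt [seen '()])
  (if (memq nt seen)
      '()
      (apply append
             (for/list ([prod (in-list (cdr (assq nt grammar)))])
               (cond
                 [(pair? prod) (list (cons nt prod))]
                 [(assq prod grammar)
                  (reachable-forms grammar prod (cons nt seen))]
                 [else '()])))))   ; terminal: contributes no keyword form

;; Every pair of reachable forms that share a shape, reported as
;; (owner-1 owner-2 shape).
(define (shape-clashes grammar nt)
  (define forms (reachable-forms grammar nt))
  (for*/list ([a (in-list forms)]
              [b (in-list (cdr (memq a forms)))]
              #:when (equal? (shape (cdr a)) (shape (cdr b))))
    (list (car a) (car b) (shape (cdr a)))))

;; Option 1 from the list above: refuse the language instead of guessing.
(define (check-unambiguous! grammar nt)
  (define clashes (shape-clashes grammar nt))
  (unless (null? clashes)
    (error 'check-unambiguous!
           "forms with the same shape are reachable from ~a: ~s"
           nt clashes)))
```

In this model, `(shape-clashes lsrc 'stmt)` reports the two begin forms, while `(shape-clashes lsrc 'expr)` returns the empty list, since only one begin is reachable from expr.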
`(begin x y z)
Should this be parsed as:
(stmt:begin x y z)
or
(expr:begin x y z)
Technically, either one is valid, but the nanopass framework isn't really set up to deal with this kind of ambiguity. We could decide that whichever appears first in your language definition is the one we use (and in fact, that is what happens now, because of the earlier bug). However, I think we probably want to raise an error (or at least a warning) for this situation, even if we've implemented option 2 above.
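To illustrate the difference between the two problems, here is a small sketch in plain Racket that enumerates every way a datum can be read as a given nonterminal when the field types are taken into account. It is only a toy model, not how the framework's parser or meta-parser works: the grammar is quoted data, `derivations` and `match-form` are hypothetical helpers, and it only handles forms of the shape (keyword field ...), which is all this language uses.

```racket
#lang racket/base
(require racket/list)   ; for cartesian-product

;; Toy model of Lsrc as plain data; s and n stand for the symbol and
;; number terminals. This is NOT the framework's real representation.
(define lsrc
  '((stmt n expr (begin stmt ...))
    (expr s (begin expr ...))))

;; All ways to read `datum` as `type`, each rendered as a tagged tree.
(define (derivations type datum)
  (cond
    [(eq? type 's) (if (symbol? datum) (list datum) '())]
    [(eq? type 'n) (if (number? datum) (list datum) '())]
    [else
     (apply append
            (for/list ([prod (in-list (cdr (assq type lsrc)))])
              (if (symbol? prod)
                  (derivations prod datum)          ; terminal or included nonterminal
                  (match-form type prod datum))))]))

;; Only handles productions of the shape (keyword field ...).
;; Every subform must itself be readable as `field`; the cartesian
;; product combines the alternatives found for each subform.
(define (match-form owner prod datum)
  (define keyword (car prod))
  (define field (cadr prod))
  (if (and (list? datum) (pair? datum) (eq? (car datum) keyword))
      (for/list ([kids (in-list
                        (apply cartesian-product
                               (for/list ([d (in-list (cdr datum))])
                                 (derivations field d))))])
        (cons (string->symbol (format "~a:~a" owner keyword)) kids))
      '()))

(derivations 'stmt '(begin x y z))
;; → '((expr:begin x y z) (stmt:begin x y z))   two readings: ambiguous
(derivations 'stmt '(begin a 5))
;; → '((stmt:begin a 5))                        one reading: types decide
```

Under these assumptions, `(begin a 5)` has exactly one reading once types are consulted, because 5 is not an expr, so this is the case a smarter parser could handle. `(begin x y z)` still has two readings, which is the case where an error or warning seems appropriate.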
-andy:)
(In reply to Leif Andersen's message of August 18, 2015 at 11:04:02 AM.)