Allow or check overwriting of built-in token types
BNFC accepts this grammar:

```
EId.  Expr ::= Ident;
EAdd. Expr ::= Expr "+" Expr;
token Ident (letter (letter)*);
```
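For context (my recollection of the LBNF report, so take the exact character set with a grain of salt), the built-in definition of `Ident` that this declaration clashes with is roughly:

```
token Ident (letter (letter | digit | '_' | '\'')*);
```

So a user-supplied `token Ident` is a genuine redefinition (here, a strictly narrower one), not just a duplicate of the default.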
but it fails in the Haskell backend:

```
happy -gca Ident/Par.y
Ident/Par.y: 25: multiple use of 'L_ident'
Ident/Par.y: 27: multiple use of 'L_ident'
Ident/Par.y: multiple use of 'L_ident'
Ident/Par.y: multiple use of 'L_ident'
```
Ideally, `token Ident ...` should overwrite the built-in definition of `Ident`.
At least, there should be an error message from bnfc.
I started working on this towards allowing the overwrite, but I realized I would have to modify every single backend for it. :(
(I expected the implementation of bnfc to be a bit more modular...)
Yes. Right now each backend adds its own hard-coded definition for the built-in token types.
I think the right way to fix this is to add the built-in token categories to the CF object that is passed to the backends: once the lbnf file has been parsed, we should add the built-in token types that are 1) used in the grammar and 2) not defined by the user.
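The selection logic for step 1) and 2) could be sketched like this (the names `builtinTokens` and `missingBuiltins` are hypothetical, for illustration only, and not BNFC's actual API):

```haskell
-- Sketch only: illustrates which built-in token definitions should be
-- added to the CF value after parsing the lbnf file.

-- The built-in token categories BNFC knows about.
builtinTokens :: [String]
builtinTokens = ["Ident", "Integer", "Double", "Char", "String"]

-- Built-ins that are used in the grammar but not redefined by the user:
-- exactly the ones whose default definition should be injected into the CF,
-- so the backends no longer need their own hard-coded copies.
missingBuiltins :: [String] -> [String] -> [String]
missingBuiltins usedCats userDefined =
  [ t | t <- builtinTokens, t `elem` usedCats, t `notElem` userDefined ]

main :: IO ()
main = do
  -- With a user-supplied `token Ident ...`, nothing needs to be added:
  print (missingBuiltins ["Expr", "Ident"] ["Ident"])
  -- Without it, the built-in Ident definition would be added:
  print (missingBuiltins ["Expr", "Ident"] [])
```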
If you look in src/BNFC/Lexing.hs, the regular expressions for each of the built-in tokens are defined there. What's left is to add them to the cf value in GetCF and remove the hard-coded values from the different backends (but GetCF being a bit of a mess, that might not be the easiest job :confused:).
The good thing is that, now that there is a big regression test suite, changing all the backends is a bit less scary than it used to be...
I assigned this issue to you since you said you started working on this. :+1:
Ok, I will have a go at it. I have already changed Lexing.hs, but I still need to do the refactoring so that this change is picked up by the backends.
This issue gains additional weight from problems such as https://github.com/BNFC/bnfc/issues/302#issuecomment-720151556.