sql-formatter
Jison integration
An experiment in using Jison.
Key takeaways so far:
- The lexer API is quite different. Instead of returning token objects, the main `lex()` method returns only the token type and stores the actual text in the `yytext` field.
- The parser is only able to match token types. It can't also look at the actual text/value of the token. In our case that means we can't have a general KEYWORD token, but would need a separate type for each keyword: SELECT, FROM, AND, etc.
- While Bison supports the GLR parsing algorithm, which allows handling ambiguous grammars, Jison doesn't. Jison does allow solving some conflicts using operator precedence, but it doesn't seem to support everything that Bison does, and I'm not sure that would work for us. For the ambiguous grammar I currently have, Jison gives error messages but still produces a working parser, which resolves the ambiguity by just going with the first possible match. That's definitely not the way to go.
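To make the first two points concrete, here is a minimal hand-written sketch of a lexer with that shape. It is not Jison's generated lexer. It assumes only the `setInput()`/`lex()`/`yytext` interface described above. The names `createJisonStyleLexer`, `tokenize`, `KEYWORDS`, and the `"IDENT"` and `"EOF"` type names are made up for illustration.

```javascript
// Hypothetical sketch: mimics the lex()/yytext shape, not Jison's own code.
// Every keyword needs its own token type, because the parser can only
// match on the type returned by lex() and never sees yytext.
const KEYWORDS = new Set(["SELECT", "FROM", "AND"]);

function createJisonStyleLexer() {
  return {
    yytext: "",
    input: "",
    pos: 0,

    setInput(text) {
      this.input = text;
      this.pos = 0;
      this.yytext = "";
    },

    // Returns only the token type; the matched text is left in this.yytext.
    lex() {
      // Skip whitespace, then match either a word or one other character.
      const pattern = /\s*(?:(\w+)|(\S))/y;
      pattern.lastIndex = this.pos;
      const match = pattern.exec(this.input);
      if (!match) {
        this.yytext = "";
        return "EOF"; // assumed end-of-input type name
      }
      this.pos = pattern.lastIndex;
      if (match[1] === undefined) {
        // Punctuation: use the character itself as the token type.
        this.yytext = match[2];
        return this.yytext;
      }
      this.yytext = match[1];
      const upper = this.yytext.toUpperCase();
      return KEYWORDS.has(upper) ? upper : "IDENT";
    },
  };
}

// Adapter back to the token-object style: pairs each returned type
// with whatever lex() left in yytext.
function tokenize(lexer, text) {
  lexer.setInput(text);
  const tokens = [];
  for (let type = lexer.lex(); type !== "EOF"; type = lexer.lex()) {
    tokens.push({ type, text: lexer.yytext });
  }
  return tokens;
}
```

Calling `tokenize(createJisonStyleLexer(), "select a from t")` yields the types `SELECT`, `IDENT`, `FROM`, `IDENT`. A grammar written against this lexer would have to name `SELECT` and `FROM` as distinct terminals instead of a single KEYWORD.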
One thing I do like about Jison is that its grammar files are much more succinct to write, mainly because in Nearley syntax one needs to wrap everything in anonymous functions. For comparison:
Nearley:
parenthesis -> "(" exp ")" {% ([open, expr, close]) => ({type: 'parenthesis', children: expr}) %}
Jison:
parenthesis: '(' exp ')' {$$ = {type: 'parenthesis', children: $exp}};
Declining this as Nearley looks like a better candidate for our existing implementation.
Agreed. We're unlikely to use it.