Jison integration

Open nene opened this issue 2 years ago • 1 comments

An experiment in using Jison.

Key takeaways so far:

  • The lexer API is quite different. Instead of returning token objects, the main lex() method returns the token type and stores the matched text in the yytext field.
  • The parser can only match on token types; it can't also look at the actual text/value of a token. In our case that means we can't have a general KEYWORD token, but would need a separate type for each keyword: SELECT, FROM, AND, etc.
  • While Bison supports the GLR parsing algorithm, which allows handling ambiguous grammars, Jison doesn't. Jison does allow resolving some conflicts using operator precedence, but it doesn't seem to support everything Bison does, and I'm not sure that would work for us. For the current ambiguous grammar that I wrote, Jison gives error messages but still produces a working parser, which resolves the ambiguity by simply going with the first possible match. That's definitely not the way to go.
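
To make the first two points concrete, here is a rough TypeScript sketch of the two lexer styles. Everything in it (tokenize, JisonStyleLexer, lexAll, the KEYWORDS list, splitting on whitespace) is hypothetical and only mimics the shape of the APIs; it is neither Jison's generated lexer nor the sql-formatter tokenizer.

```typescript
// Hypothetical mini-lexers for illustration only.
type Token = { type: string; text: string };

const KEYWORDS: string[] = ["SELECT", "FROM", "AND"];

function splitWords(input: string): string[] {
  return input.split(/\s+/).filter((w) => w.length > 0);
}

// Style 1: every token is an object, so a grammar rule can check
// both the general type (KEYWORD) and the concrete text.
function tokenize(input: string): Token[] {
  return splitWords(input).map((text) => ({
    type: KEYWORDS.indexOf(text.toUpperCase()) !== -1 ? "KEYWORD" : "IDENT",
    text,
  }));
}

// Style 2: lex() returns only the token type; the matched text is
// stored on the lexer in the yytext field.
class JisonStyleLexer {
  yytext = "";
  private words: string[] = [];
  private pos = 0;

  setInput(input: string): void {
    this.words = splitWords(input);
    this.pos = 0;
  }

  lex(): string {
    if (this.pos >= this.words.length) {
      this.yytext = "";
      return "EOF";
    }
    this.yytext = this.words[this.pos] || "";
    this.pos += 1;
    const upper = this.yytext.toUpperCase();
    // The parser only sees this return value, so each keyword
    // has to be its own token type instead of a shared KEYWORD.
    return KEYWORDS.indexOf(upper) !== -1 ? upper : "IDENT";
  }
}

// Drains the type-only lexer and records "TYPE:text" for each token.
function lexAll(input: string): string[] {
  const lexer = new JisonStyleLexer();
  lexer.setInput(input);
  const out: string[] = [];
  for (let t = lexer.lex(); t !== "EOF"; t = lexer.lex()) {
    out.push(t + ":" + lexer.yytext);
  }
  return out;
}
```

With the second style, a grammar that wants to match any keyword would have to list every keyword type as an alternative, which is the extra cost described in the second point above.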

nene avatar Jul 07 '22 19:07 nene

One thing I do like about Jison is that its grammar files are much more succinct to write, mainly because in Nearley syntax one needs to wrap everything in anonymous functions. For comparison:

Nearley:

parenthesis -> "(" exp ")" {% ([open, expr, close]) => ({type: 'parenthesis', children: expr}) %}

Jison:

parenthesis: '(' exp ')' {$$ = {type: 'parenthesis', children: $exp}};
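
For reference, the Nearley postprocessor is an ordinary function that receives the matched symbols as an array in rule order, whereas a Jison action refers to them through $-variables and assigns the result to $$. A minimal TypeScript sketch of the same postprocessor written as a standalone function (the ParenthesisNode type and the function name are hypothetical):

```typescript
// Hypothetical node type, for illustration only.
type ParenthesisNode = { type: "parenthesis"; children: unknown };

// Nearley calls the postprocessor with the matched symbols as an array,
// here: ["(", <result of exp>, ")"].
function parenthesisPostprocess(matched: unknown[]): ParenthesisNode {
  const expr = matched[1]; // the middle symbol is the inner expression
  return { type: "parenthesis", children: expr };
}
```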

nene avatar Jul 07 '22 20:07 nene

Declining this as Nearley looks like a better candidate for our existing implementation.

inferrinizzard avatar Aug 17 '22 14:08 inferrinizzard

Agreed. We're unlikely to use it.

nene avatar Aug 17 '22 16:08 nene