SQLite grammar: top level `parse` rule accepts statements not separated by semicolons
The SQLite language syntax defines a sql-stmt-list that consists of zero, one or more sql-stmts.
Consecutive sql-stmts must be separated by a semicolon. See https://www.sqlite.org/lang.html.
Here is a simple example with SQLite (SQLite 3.37.0):
sqlite> SELECT Title FROM Album LIMIT 2; SELECT ArtistId FROM Album LIMIT 3;
For Those About To Rock We Salute You
Balls to the Wall
1
1
2
Omitting the first semicolon (and hence not properly separating the two SELECT statements) leads to an error:
sqlite> SELECT Title FROM Album LIMIT 2 SELECT ArtistId FROM Album LIMIT 3;
Error: in prepare, near "SELECT": syntax error (1)
The ANTLR grammar currently accepts the second form but IMHO shouldn't.
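For anyone who wants to reproduce the SQLite side without the CLI, here is a minimal sketch using Python's built-in sqlite3 module. The helper name `sqlite_accepts` and the two-column Album schema are made up for illustration; only the two SELECT scripts come from the session above.

```python
import sqlite3

def sqlite_accepts(script: str) -> bool:
    """Return True if SQLite can prepare and run every statement in `script`."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical stand-in for the Album table used in the session above.
        conn.execute("CREATE TABLE Album (Title TEXT, ArtistId INTEGER)")
        conn.executescript(script)
        return True
    except sqlite3.OperationalError:
        # SQLite syntax errors surface as OperationalError in the sqlite3 module.
        return False
    finally:
        conn.close()

separated = "SELECT Title FROM Album LIMIT 2; SELECT ArtistId FROM Album LIMIT 3;"
unseparated = "SELECT Title FROM Album LIMIT 2 SELECT ArtistId FROM Album LIMIT 3;"

print(sqlite_accepts(separated))
print(sqlite_accepts(unseparated))
```

A test for the grammar could use the same two scripts as a positive and a negative example.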
Root cause
The grammars-v4 grammar for the SQLite SQL dialect is defined as follows:
https://github.com/antlr/grammars-v4/blob/07314e4615982ba77864d7b8cd804c7b5d803bb0/sql/sqlite/SQLiteParser.g4#L37-L42
So parse also accepts a sequence of sql_stmt_lists that each consist of a single sql_stmt, and hence accepts SQL statements that are not separated by semicolons.
Instead the parse rule should probably be:
parse: (sql_stmt_list)? EOF
;
which accepts at most one sql_stmt_list and hence requires sql_stmts to be separated by SCOL (semicolon).
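To illustrate the difference between the two parse rules, here is a rough toy model in Python. It is not the ANTLR parser: each statement is collapsed to the single character "S" and each semicolon to ";", and the rules are written as regular expressions. The shape of sql_stmt_list used here (`SCOL* sql_stmt (SCOL+ sql_stmt)* SCOL*`) and the current rule `(sql_stmt_list)* EOF` are assumptions about the linked lines; the names `CURRENT_PARSE`, `PROPOSED_PARSE` and `accepts` are made up for this sketch.

```python
import re

# One character per token: "S" stands for a whole sql_stmt, ";" for SCOL.
# Assumed rule shape: sql_stmt_list: SCOL* sql_stmt (SCOL+ sql_stmt)* SCOL*
SQL_STMT_LIST = r";*S(?:;+S)*;*"

# Assumed current rule:  parse: (sql_stmt_list)* EOF
CURRENT_PARSE = re.compile(rf"(?:{SQL_STMT_LIST})*")
# Proposed rule:         parse: (sql_stmt_list)? EOF
PROPOSED_PARSE = re.compile(rf"(?:{SQL_STMT_LIST})?")

def accepts(rule: re.Pattern, tokens: str) -> bool:
    """fullmatch plays the role of EOF: the whole input must be consumed."""
    return rule.fullmatch(tokens) is not None

# "SS" models two statements with no semicolon between them.
for tokens in ["", "S", "S;S;", "SS"]:
    print(repr(tokens), accepts(CURRENT_PARSE, tokens), accepts(PROPOSED_PARSE, tokens))
```

In this model only the starred rule accepts "SS", because it can split the input into two one-statement lists. Both rules agree on the empty input and on properly separated statements.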
Here is a change with accompanying tests: https://github.com/antlr/grammars-v4/compare/master...juretta:grammars-v4:master#diff-8bf47bc01d73b92fd763a6f79c0a9827e7006434ce463139e2f40cecc6a2ee1a
Changing parse causes the generated parser to emit a single (optional) sql_stmt_list instead of a list of them. I'm not entirely sure about the ramifications here, but this seems to break all code that relies on the previous behaviour, so updating to the new grammar would be a breaking change.