sqllineage
sqllineage copied to clipboard
Error parsing tsql without semicolons
Describe the bug Error parsing tsql containing 'group by' statement without semicolon. add TSQL_NO_SEMICOLON config.
SQL
select id from foo group by id
select * from bar
-
if
CLI (Command Line Interface): provide the command you're calling and the output.
sqllineage -e "select id from foo group by id select * from bar" -d "tsql"
Statements(#): 1
Source Tables:
<default>.foo
Target Tables:
Python version (available via python --version
)
- 3.12.0
SQLLineage version (available via sqllineage --version
):
- 1.5.0
Additional context This type of tsql without semicolons can be parsed
select * from foo
select * from bar
This is a parser issue with sqlfluff that we can verify with:
sqlfluff parse test.sql --dialect=tsql
There's only one statement parsed when GROUP BY clause is added:
select id from foo group by id
select * from bar
[L: 1, P: 1] |file:
[L: 1, P: 1] | batch:
[L: 1, P: 1] | statement:
[L: 1, P: 1] | select_statement:
[L: 1, P: 1] | select_clause:
[L: 1, P: 1] | keyword: 'select'
[L: 1, P: 7] | [META] indent:
[L: 1, P: 7] | whitespace: ' '
[L: 1, P: 8] | select_clause_element:
[L: 1, P: 8] | column_reference:
[L: 1, P: 8] | naked_identifier: 'id'
[L: 1, P: 10] | [META] dedent:
[L: 1, P: 10] | whitespace: ' '
[L: 1, P: 11] | from_clause:
[L: 1, P: 11] | keyword: 'from'
[L: 1, P: 15] | whitespace: ' '
[L: 1, P: 16] | from_expression:
[L: 1, P: 16] | [META] indent:
[L: 1, P: 16] | from_expression_element:
[L: 1, P: 16] | table_expression:
[L: 1, P: 16] | table_reference:
[L: 1, P: 16] | naked_identifier: 'foo'
[L: 1, P: 19] | [META] dedent:
[L: 1, P: 19] | whitespace: ' '
[L: 1, P: 20] | groupby_clause:
[L: 1, P: 20] | keyword: 'group'
[L: 1, P: 25] | whitespace: ' '
[L: 1, P: 26] | keyword: 'by'
[L: 1, P: 28] | [META] indent:
[L: 1, P: 28] | whitespace: ' '
[L: 1, P: 29] | column_reference:
[L: 1, P: 29] | naked_identifier: 'id'
[L: 1, P: 31] | whitespace: ' '
[L: 1, P: 32] | newline: '\n'
[L: 2, P: 1] | expression:
[L: 2, P: 1] | select_statement:
[L: 2, P: 1] | select_clause:
[L: 2, P: 1] | keyword: 'select'
[L: 2, P: 7] | [META] indent:
[L: 2, P: 7] | whitespace: ' '
[L: 2, P: 8] | select_clause_element:
[L: 2, P: 8] | wildcard_expression:
[L: 2, P: 8] | wildcard_identifier:
[L: 2, P: 8] | star: '*'
[L: 2, P: 9] | [META] dedent:
[L: 2, P: 9] | whitespace: ' '
[L: 2, P: 10] | from_clause:
[L: 2, P: 10] | keyword: 'from'
[L: 2, P: 14] | whitespace: ' '
[L: 2, P: 15] | from_expression:
[L: 2, P: 15] | [META] indent:
[L: 2, P: 15] | from_expression_element:
[L: 2, P: 15] | table_expression:
[L: 2, P: 15] | table_reference:
[L: 2, P: 15] | naked_identifier: 'bar'
[L: 2, P: 18] | [META] dedent:
[L: 2, P: 18] | [META] dedent:
[L: 2, P: 18] | newline: '\n'
[L: 3, P: 1] | [META] end_of_file:
Whereas removing GROUP BY results in the correct AST with two statements:
select id from foo
select * from bar
[L: 1, P: 1] |file:
[L: 1, P: 1] | batch:
[L: 1, P: 1] | statement:
[L: 1, P: 1] | select_statement:
[L: 1, P: 1] | select_clause:
[L: 1, P: 1] | keyword: 'select'
[L: 1, P: 7] | [META] indent:
[L: 1, P: 7] | whitespace: ' '
[L: 1, P: 8] | select_clause_element:
[L: 1, P: 8] | column_reference:
[L: 1, P: 8] | naked_identifier: 'id'
[L: 1, P: 10] | [META] dedent:
[L: 1, P: 10] | whitespace: ' '
[L: 1, P: 11] | from_clause:
[L: 1, P: 11] | keyword: 'from'
[L: 1, P: 15] | whitespace: ' '
[L: 1, P: 16] | from_expression:
[L: 1, P: 16] | [META] indent:
[L: 1, P: 16] | from_expression_element:
[L: 1, P: 16] | table_expression:
[L: 1, P: 16] | table_reference:
[L: 1, P: 16] | naked_identifier: 'foo'
[L: 1, P: 19] | [META] dedent:
[L: 1, P: 19] | whitespace: ' '
[L: 1, P: 20] | newline: '\n'
[L: 2, P: 1] | statement:
[L: 2, P: 1] | select_statement:
[L: 2, P: 1] | select_clause:
[L: 2, P: 1] | keyword: 'select'
[L: 2, P: 7] | [META] indent:
[L: 2, P: 7] | whitespace: ' '
[L: 2, P: 8] | select_clause_element:
[L: 2, P: 8] | wildcard_expression:
[L: 2, P: 8] | wildcard_identifier:
[L: 2, P: 8] | star: '*'
[L: 2, P: 9] | [META] dedent:
[L: 2, P: 9] | whitespace: ' '
[L: 2, P: 10] | from_clause:
[L: 2, P: 10] | keyword: 'from'
[L: 2, P: 14] | whitespace: ' '
[L: 2, P: 15] | from_expression:
[L: 2, P: 15] | [META] indent:
[L: 2, P: 15] | from_expression_element:
[L: 2, P: 15] | table_expression:
[L: 2, P: 15] | table_reference:
[L: 2, P: 15] | naked_identifier: 'bar'
[L: 2, P: 18] | [META] dedent:
[L: 2, P: 18] | newline: '\n'
[L: 3, P: 1] | [META] end_of_file:
We need to fix it on sqlfluff side.