sqlparse icon indicating copy to clipboard operation
sqlparse copied to clipboard

Statement.get_type() does not skip comments between subqueries.

Open shikanime opened this issue 4 years ago • 1 comments

SQL query can contain comment in between WITH multiple query but the get_type() implementation doesn't skip them

>>> query, = sqlparse.parse("""
WITH A AS (),
          -- A comment about the B subquery...
          B AS ()
SELECT * FROM B
""")
>>> query.get_type()
'UNKNOWN'

Without the comment:

>>> query, = sqlparse.parse("""
WITH A AS (),
          B AS ()
SELECT * FROM B
""")
>>> query.get_type()
'SELECT'

shikanime avatar Jul 16 '21 10:07 shikanime

Ran into this problem too 😱

I think the fix might be better done in the underlying sql parsing rather than in get_type. Both queries above should have roughly the same tree, but sqlparse incorrectly breaks up the identifiers instead of grouping them in the same IdentifierList.

Tokens with comment:

[
  <CTE 'WITH' at 0x7F87354A9520>,
 <Whitespace ' ' at 0x7F87354A9760>,
 <Identifier 'A AS ()' at 0x7F8727BC8BA0>,
 ...,
 <Comment '-- A c...' at 0x7F8727BC8C80>,
 <Identifier 'B AS ()' at 0x7F8727BC86D0>,
 <DML 'SELECT' at 0x7F87354A9580>,
]

Tokens without comment:

[
 <CTE 'WITH' at 0x7F872738A340>,
 <Whitespace ' ' at 0x7F872738A580>,
 <IdentifierList 'A AS (...' at 0x7F8727BC84A0>,
 <DML 'SELECT' at 0x7F87354A9580>,
 ...
]

@vmuriart I wonder if you had any thoughts on this? I thought this comment might be relevant.

serenajiang avatar Sep 07 '21 23:09 serenajiang