superset icon indicating copy to clipboard operation
superset copied to clipboard

SQL parser incorrectly identifies SQL reserved keywords as table names

Open john-bodley opened this issue 4 years ago • 7 comments

There's a few examples we've seen where the SQL parser incorrectly identifies SQL reserved keywords as tables when parsing the query. Note I've worked with sqlparse in another project and I sense we probably may need to complete rewrite Superset's parser in order for accurately extracting the table names leveraging more of the core sqlparse's functionality.

Expected results

The SQL parser should identify all data warehouse tables/views with no false positives.

Actual results

The SQL parser misidentifies (under certain circumstances) SQL reserved keywords as table names.

How to reproduce the bug

>>> from superset.sql_parse import ParsedQuery
>>> ParsedQuery(sql_statement="SELECT * FROM foo bar TABLESAMPLE BERNOULLI(50);").tables
{Table(table='TABLESAMPLE', schema=None, catalog=None), Table(table='foo', schema=None, catalog=None)}

Note adding the AS keyword results in a different result:

>>> from superset.sql_parse import ParsedQuery
>>> ParsedQuery(sql_statement="SELECT * FROM foo AS bar TABLESAMPLE BERNOULLI(50);").tables
{Table(table='foo', schema=None, catalog=None)}

Environment

(please complete the following information):

  • superset version: superset version
  • python version: python --version
  • node.js version: node -v

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

john-bodley avatar Jan 22 '21 05:01 john-bodley

@bkyryliuk and @villebro et al. any thoughts on this? Has Dropbox or Preset run into this issue?

john-bodley avatar Feb 05 '21 19:02 john-bodley

@villebro or @dpgaspar has Preset experienced this issue?

john-bodley avatar Mar 02 '21 21:03 john-bodley

Not that I'm aware of. probably these keywords are not on sqlparse, not too familiar with it, would it make sense or solve the issue to extend those keywords?

dpgaspar avatar Mar 03 '21 09:03 dpgaspar

@john-bodley I'm surprised adding that AS there makes a difference, I would have assumed it easily infers that's an alias in the first example. I assume this is a bug on sqlparse, maybe we should open a ticket there? I also haven't heard of anyone having this problem on Preset.

villebro avatar Mar 03 '21 10:03 villebro

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

stale[bot] avatar May 02 '22 06:05 stale[bot]

@john-bodley Your call to keep or close this one... it's gotten pretty old :)

rusackas avatar Feb 05 '24 18:02 rusackas

@rusackas, hopefully this issue will be resolved via @betodealmeida SIP-117.

john-bodley avatar Feb 12 '24 19:02 john-bodley

This seems like it's either fixed or stale. Closing either way, but happy to reopen if I'm mistaken.

rusackas avatar Jun 10 '24 19:06 rusackas