sql-metadata icon indicating copy to clipboard operation
sql-metadata copied to clipboard

Parser().tables fix for missing schema in ms_sql and sybase

Open davlee1972 opened this issue 1 year ago • 2 comments

Can we modify the two functions below?

Microsoft SQL Server and Sybase both allow empty db schema(s). If empty it defaults to either the user login or "dbo" depending if user.table or dbo.table exists.

This works..

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db.my_schema.my_table").tables
['my_db.my_schema.my_table']

But this does not:

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db..my_table").tables
['..my_table']

The code modifications will resolve to proper table names..

>>> from sql_metadata import Parser
>>> Parser("select 1 from my_db..my_table").tables
['my_db..my_table']

Code fix: Only step back one token if schema is missing.

    def _combine_qualified_names(self, index: int, token: SQLToken) -> None:
        """
        Combines names like <schema>.<table>.<column> or <table/sub_query>.<column>
        """
        value = token.value
        is_complex = True
        while is_complex:
            value, index, is_complex = self._combine_tokens(index=index, value=value)
        token.value = value

    def _combine_tokens(self, index: int, value: str) -> Tuple[str, int, bool]:
        """
        Checks if complex identifier is longer and follows back until it's finished
        """
        if index > 1 and str(self.non_empty_tokens[index - 1]) == ".":
            prev_value = self.non_empty_tokens[index - 2].value.strip("`").strip('"')
            """
            Only step back one token if schema is missing to get database name
            Otherwise step back two tokens to get database name
            """
            if prev_value == ".":
                index = index - 1
                value = f".{value}"
            else:
                index = index - 2
                value = f"{prev_value}.{value}"
            return value, index, True
        return value, None, False

davlee1972 avatar Mar 20 '24 19:03 davlee1972

Technical Docs..

From the documentation:

When database objects are referenced by using a one-part name, SQL Server first looks in the user's default schema. If the object is not found there, SQL Server looks next in the dbo schema. If the object is not in the dbo schema, an error is returned.

They say "one-part name" but this also applies when you reference an object with a pattern like:

[server].[database]..[name]

or

[database]..[name]

davlee1972 avatar Mar 20 '24 19:03 davlee1972

@davlee1972 - thanks! Can you prepare a PR with the fix you've proposed above?

macbre avatar May 08 '24 11:05 macbre