databricks-sql-python
sql.connect wrongly reports a timeout when attempting to connect to a non-existing warehouse
Disclaimer: I am a Databricks employee.
Calling `databricks.sql.connect` with a non-existing `warehouse_id` raises a generic timeout exception coming from the `ThriftBackend`.
Only by digging into the source code can one discover the default of 30(!) retries, which can be overridden by passing `_retry_stop_after_attempts_count` to `sql.connect`.
As a user of the SQL client, I'd like to easily distinguish between a timeout (unknown circumstances, so retrying makes sense) and a non-existing/deleted warehouse (a hard fact, so it makes no sense to keep trying to connect).
```python
connection = sql.connect(
    server_hostname=server_hostname,
    http_path=f"/sql/1.0/warehouses/{warehouse_id}",
    credentials_provider=_credentials_provider,
)
```
Exception:
```
HTTPSConnectionPool(host='xxxx.cloud.databricks.com', port=443):
Max retries exceeded with url: /sql/1.0/warehouses/786786d78562786
(Caused by ResponseError('too many 404 error responses'))
```
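Until the connector surfaces a dedicated error, the failure above can at least be made fast by lowering the retry count. A minimal sketch, assuming the same placeholders as the snippet above (`server_hostname`, `warehouse_id`, `_credentials_provider`) and noting that `_retry_stop_after_attempts_count` is an underscored, effectively private parameter that may change between releases:

```python
# Hedged workaround sketch: limit retries so a non-existing warehouse fails
# fast instead of going through the default ~30 attempts. The helper only
# builds the keyword arguments; the actual connect() call is shown commented.

def connect_kwargs(server_hostname, warehouse_id, credentials_provider):
    """Build keyword arguments for databricks.sql.connect()."""
    return dict(
        server_hostname=server_hostname,
        http_path=f"/sql/1.0/warehouses/{warehouse_id}",
        credentials_provider=credentials_provider,
        # private parameter; gives up after the first 404 instead of retrying
        _retry_stop_after_attempts_count=1,
    )

# connection = sql.connect(**connect_kwargs(server_hostname, warehouse_id,
#                                           _credentials_provider))
```

This only shortens the wait; the exception raised is still the generic retry error, which is the core of this issue.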
Alternatives considered
- One could add the Databricks SDK as an extra dependency and check whether the warehouse exists before connecting. Adding the SDK just for that purpose seems like overkill; the check really should be part of the connect client.
- One could parse the exception text and apply heuristics to form an educated guess from the 404 reply. This is hacky and likely has too many edge cases: e.g. calling /sql/1.0/**warehoses**/xxxxx (a typo'd path) would raise the exact same exception.
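To make the second alternative concrete, a minimal sketch of the string-matching heuristic being rejected (the error message is taken from the exception above; the function name is hypothetical):

```python
# Hedged sketch of the "parse the exception text" heuristic: inspect the
# retry error message for a 404 cause. Note it cannot tell a deleted
# warehouse apart from a typo'd path, which is exactly why a first-class
# error from the connector would be preferable.

def looks_like_missing_warehouse(exc: Exception) -> bool:
    """Guess whether a connect() failure was caused by 404s, not a timeout."""
    text = str(exc)
    return "Max retries exceeded" in text and "404" in text

# Simulated stand-in for the exception raised by sql.connect
err = Exception(
    "Max retries exceeded with url: /sql/1.0/warehouses/786786d78562786 "
    "(Caused by ResponseError('too many 404 error responses'))"
)
print(looks_like_missing_warehouse(err))  # → True
```

The same `True` would come back for any 404 cause, including a misspelled path segment, so the heuristic can only say "something returned 404", not "this warehouse does not exist".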