databricks-sql-python icon indicating copy to clipboard operation
databricks-sql-python copied to clipboard

Slowness in query submission and response.

Open blanketspy99 opened this issue 1 year ago • 10 comments

While running the databricks query, the query submission and response is taking a lot of time to return back although actual runtime in the query history is within a second. But, the python gets data back after 3-5 minutes.

I'm running a sequence of 5 queries which opens and closes session after each one and it's ultimately delaying up to 15 minutes which in actuality shall complete in 5-10 seconds. The workspace is AWS private linked one.

blanketspy99 avatar Aug 05 '24 18:08 blanketspy99

the query submission and response is taking a lot of time

opens and closes session after each one

Could you share your code? Is the delay occurring at the connect or execute stage?

jaames-bentley avatar Aug 06 '24 16:08 jaames-bentley

def execute_databricks_query(query):
  """
  Function to execute a query inside Databricks and returns back the result.
  Expects following environmental variables to be defined first,
  DATABRICKS_SERVER_HOSTNAME, DATABRICKS_HTTP_PATH, DATABRICKS_TOKEN
  """
  with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
                  http_path       = os.getenv("DATABRICKS_HTTP_PATH"),
                  access_token    = os.getenv("DATABRICKS_TOKEN")) as connection:

    with connection.cursor() as cursor:
      cursor.execute(query)
      # cursor.execute("SELECT * FROM lineage_catalog_qa.admin.meta_control_table")
      result = cursor.fetchall()
      return result
    

I'm using the above function.

The slowness is happening at connect stage, the execution is fast and then the response back is taking time again.

Note: Our environment is a AWS Privatelinked one and network flow is within VPC to Databricks. This issue is from past 1-2 weeks.

shahrukh-shaik avatar Aug 08 '24 13:08 shahrukh-shaik

What version of Python are you using? I opened #422 the other day regarding slow connection and found it could be solved by using a different Python version.

Either way, if you run the function in debug mode, have you traced to call stack to see where the delay is occurring?

jaames-bentley avatar Aug 08 '24 13:08 jaames-bentley

I was trying with python 3.7 and later moved to python3.8 as I see python3.8 supports the latest sql-connector version 3.3. Moving from python3.7 to python3.8 reduced my query time to half.

And, I see first 2 queries out of 5 mostly identical are running quick and then the delay starts.

I have tried the debug too, will try to share the debug logs here, if it can helps.

shahrukh-shaik avatar Aug 08 '24 14:08 shahrukh-shaik

22:06:12  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:06:12  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:06:12  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:06:12  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:06:12  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:06:12  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:06:12  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:06:12  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:06:12  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:06:12  INFO :: Successfully opened session 01ef55a4-52bc-1323-8ab1-f0b3564ad9e9
22:06:12  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:06:13  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:06:13  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:06:13  INFO :: Closing session 01ef55a4-52bc-1323-8ab1-f0b3564ad9e9
22:06:13  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:06:14  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:06:14  DEBUG :: Received response: TCloseSessionResp(<REDACTED>)
22:06:14  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:06:14  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:06:14  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:06:14  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:06:14  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:06:14  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:06:14  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:08:35  INFO :: Successfully opened session 01ef55a4-a14e-15ab-b620-038a35e8489e
22:08:35  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:08:35  INFO :: Closing session 01ef55a4-a14e-15ab-b620-038a35e8489e
22:08:35  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TCloseSessionResp(<REDACTED>)
22:08:35  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:08:35  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:08:35  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:08:35  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:08:35  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:08:35  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:08:35  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:08:35  INFO :: Successfully opened session 01ef55a4-a1b8-15ca-84e0-0776f5bfcdc7
22:08:35  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:08:35  INFO :: Closing session 01ef55a4-a1b8-15ca-84e0-0776f5bfcdc7
22:08:35  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:08:35  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:08:35  DEBUG :: Received response: TCloseSessionResp(<REDACTED>)
22:08:35  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:08:35  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:08:35  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:08:35  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:08:35  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:08:35  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:08:35  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:10:41  INFO :: Successfully opened session 01ef55a4-ef7f-1a9b-b4d9-0ca1310959f7
22:10:41  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:10:41  INFO :: Closing session 01ef55a4-ef7f-1a9b-b4d9-0ca1310959f7
22:10:41  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TCloseSessionResp(<REDACTED>)
22:10:41  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:10:41  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:10:41  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:10:41  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:10:41  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:10:41  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:10:41  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:10:41  INFO :: Successfully opened session 01ef55a4-f07c-1e49-a0ad-ace3dacf8680
22:10:41  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:10:41  INFO :: Closing session 01ef55a4-f07c-1e49-a0ad-ace3dacf8680
22:10:41  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:10:41  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:10:41  DEBUG :: Received response: TCloseSessionResp(<REDACTED>)
22:10:41  DEBUG :: retry parameter: _retry_delay_min given_or_default 1.0
22:10:41  DEBUG :: retry parameter: _retry_delay_max given_or_default 60.0
22:10:41  DEBUG :: retry parameter: _retry_stop_after_attempts_count given_or_default 30
22:10:41  DEBUG :: retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
22:10:41  DEBUG :: retry parameter: _retry_delay_default given_or_default 5.0
22:10:41  DEBUG :: Sending request: OpenSession(<REDACTED>)
22:10:41  DEBUG :: Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443
22:13:02  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:13:02  DEBUG :: Received response: TOpenSessionResp(<REDACTED>)
22:13:02  INFO :: Successfully opened session 01ef55a5-3ed9-12ed-8b0b-f8c0a3f00c82
22:13:02  DEBUG :: Sending request: ExecuteStatement(<REDACTED>)
22:13:02  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None
22:13:02  DEBUG :: Received response: TExecuteStatementResp(<REDACTED>)
22:13:02  INFO :: Closing session 01ef55a5-3ed9-12ed-8b0b-f8c0a3f00c82
22:13:02  DEBUG :: Sending request: CloseSession(<REDACTED>)
22:13:02  DEBUG :: https://my-org-redacted-dev.cloud.databricks.com:443 "POST /sql/1.0/warehouses/aebf2c8fc913084f HTTP/1.1" 200 None

@jaames-bentley -> here is the debug version of it. I see a few are quick and other calls has a delay of at least 2-3 minutes.

I would also give a try with latest version of python 3.10+ to verify the same.

shahrukh-shaik avatar Aug 08 '24 16:08 shahrukh-shaik

@shahrukh-shaik thank you for the log. It's actually interesting. I noticed "Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443" lines. They actually come from the urllib3. First time, when OpenSession command is issued by sql.connect(), the response is almost instant. Then, I think, the same connection is re-used for other Thrift commands. But every subsequent attempt takes tens of seconds. I'm wondering if that's something with urllib3, or the way we use it? 🤔

kravets-levko avatar Aug 08 '24 18:08 kravets-levko

@shahrukh-shaik thank you for the log. It's actually interesting. I noticed "Starting new HTTPS connection (1): my-org-redacted-dev.cloud.databricks.com:443" lines. They actually come from the urllib3. First time, when OpenSession command is issued by sql.connect(), the response is almost instant. Then, I think, the same connection is re-used for other Thrift commands. But every subsequent attempt takes tens of seconds. I'm wondering if that's something with urllib3, or the way we use it? 🤔

Not sure, if you can see the above function for query execution, I'm using with clause which creates a new session every time. Is this something you are talking about?

PS: i have tried python3.10 with both urllib3<2 >=2 versions and haven't seen any difference.

blanketspy99 avatar Aug 09 '24 04:08 blanketspy99

Is this something you are talking about?

Yes, exactly

kravets-levko avatar Aug 09 '24 10:08 kravets-levko

@kravets-levko

dbx_connection=sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
                http_path       = os.getenv("DATABRICKS_HTTP_PATH"),
                access_token    = os.getenv("DATABRICKS_TOKEN")) 


def execute_databricks_query(query,connection=dbx_connection):
  """
  Function to execute a query inside Databricks and returns back the result.
  Expects following environmental variables to be defined first,
  DATABRICKS_SERVER_HOSTNAME, DATABRICKS_HTTP_PATH, DATABRICKS_TOKEN
  """
# commenting above code as there is a issue with slowness https://github.com/databricks/databricks-sql-python/issues/421 
#   with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
#                   http_path       = os.getenv("DATABRICKS_HTTP_PATH"),
#                   access_token    = os.getenv("DATABRICKS_TOKEN")) as connection:
  with dbx_connection.cursor() as cursor:
    cursor.execute(query)
    # cursor.execute("SELECT * FROM lineage_catalog_qa.admin.meta_control_table")
    result = cursor.fetchall()
    return result
    

I have made the code change to open the connection only once and closing this at the end of the script using

if dbx_connection.open:
  dbx_connection.close()

Although I tried without closing it as I see in the source code, we have __del__ method, but it was throwing some error as below

DEBUG :: Closing unclosed connection for session 01ef5634-50a9-118c-bebe-4c63fd8cc823
--- Logging error ---
Traceback (most recent call last):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/opt/python/3.8.18/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 283, in __del__
    logger.debug(
Message: 'Closing unclosed connection for session 01ef5634-50a9-118c-bebe-4c63fd8cc823'
Arguments: ()
INFO :: Closing session 01ef5634-50a9-118c-bebe-4c63fd8cc823
--- Logging error ---
Traceback (most recent call last):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/opt/python/3.8.18/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 288, in __del__
    self._close(close_cursors=False)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 354, in _close
    logger.info(f"Closing session {self.get_session_id_hex()}")
Message: 'Closing session 01ef5634-50a9-118c-bebe-4c63fd8cc823'
Arguments: ()
DEBUG :: Sending request: CloseSession(<REDACTED>)
--- Logging error ---
Traceback (most recent call last):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/opt/python/3.8.18/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 288, in __del__
    self._close(close_cursors=False)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 359, in _close
    self.thrift_backend.close_session(self._session_handle)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 591, in close_session
    self.make_request(self._client.CloseSession, req)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 497, in make_request
    response_or_error_info = attempt_request(attempt)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 398, in attempt_request
    logger.debug("Sending request: {}(<REDACTED>)".format(this_method_name))
Message: 'Sending request: CloseSession(<REDACTED>)'
Arguments: ()
INFO :: Error during request to server: {"method": "CloseSession", "session-id": "b'\\x01\\xefV4P\\xa9\\x11\\x8c\\xbe\\xbeLc\\xfd\\x8c\\xc8#'", "query-id": null, "http-code": 200, "error-message": "", "original-exception": "sys.meta_path is None, Python is likely shutting down", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.0009160041809082031/900.0"}
--- Logging error ---
Traceback (most recent call last):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/opt/python/3.8.18/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/opt/python/3.8.18/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 288, in __del__
    self._close(close_cursors=False)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/client.py", line 359, in _close
    self.thrift_backend.close_session(self._session_handle)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 591, in close_session
    self.make_request(self._client.CloseSession, req)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 509, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/var/lib/jenkins/.local/lib/python3.8/site-packages/databricks/sql/thrift_backend.py", line 337, in _handle_request_error
    logger.info(network_request_error.message_with_context())
Message: 'Error during request to server: {"method": "CloseSession", "session-id": "b\'\\\\x01\\\\xefV4P\\\\xa9\\\\x11\\\\x8c\\\\xbe\\\\xbeLc\\\\xfd\\\\x8c\\\\xc8#\'", "query-id": null, "http-code": 200, "error-message": "", "original-exception": "sys.meta_path is None, Python is likely shutting down", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.0009160041809082031/900.0"}'
Arguments: ()

shahrukh-shaik avatar Aug 09 '24 10:08 shahrukh-shaik

https://stackoverflow.com/questions/64679139/nameerror-name-open-is-not-defined-when-trying-to-log-to-files

blanketspy99 avatar Aug 12 '24 21:08 blanketspy99