trino-python-client icon indicating copy to clipboard operation
trino-python-client copied to clipboard

Prevent error when closing an unused cursor

Open evansd opened this issue 9 months ago • 10 comments

SQLAlchemy will attempt to close cursor objects even if, for whatever reason, they haven't actually been used and therefore have no associated query to cancel.

Release notes

( ) This is not user-visible or docs only and no release notes are required. ( ) Release notes are required, please propose a release note for me. (*) Release notes are required, with the following suggested text:

* Support cleanly closing unused cursor

evansd avatar Sep 04 '23 07:09 evansd

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

cla-bot[bot] avatar Sep 04 '23 07:09 cla-bot[bot]

Hi, this is slightly tricky to reproduce but I'm seeing it in the tracebacks from failures in CI when our test Trino instance goes down. It's something to do with SQLAlchemy's pool manager trying to close out connections after an error.

Here's the key bit of the traceback:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
    cursor.close()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
    self.cancel()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
    raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query

[info   ] failed after 3 attempts [trino.client]
[error  ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]
And here's the full thing (click to expand)
Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1286, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1332, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1281, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1041, in _send_output
    self.send(msg)
  File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 979, in send
    self.connect()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
           ^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/sqlalchemy/dialect.py", line 373, in do_execute
    cursor.execute(statement, parameters)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 573, in execute
    self._prepare_statement(operation, statement_name)
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 448, in _prepare_statement
    query.execute()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 806, in execute
    response = self._request.post(self._query, additional_http_headers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 555, in post
    http_response = self._post(
                    ^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 899, in decorated
    raise error
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 886, in decorated
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
    cursor.close()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
    self.cancel()
  File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
    raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query

[info   ] failed after 3 attempts [trino.client]
[error  ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]

evansd avatar Sep 04 '23 09:09 evansd

Oh, interesting. Looking at the CI failures here there's an explicit test for the behaviour which my patch tries to remove: https://github.com/trinodb/trino-python-client/blob/f98a608601f5ba88fa4ba024394d241d4f5921b2/tests/integration/test_dbapi_integration.py#L1343-L1352

But I'm not sure that is the behaviour you actually want. The test was added in fca57d4 and looks like a direct copy-paste of the test for cancel(): https://github.com/trinodb/trino-python-client/blob/f98a608601f5ba88fa4ba024394d241d4f5921b2/tests/integration/test_dbapi_integration.py#L1331-L1340

I think that explicitly calling cancel() on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close().

evansd avatar Sep 04 '23 10:09 evansd

I think that explicitly calling cancel() on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close().

Yes, that is a fair point.

cc: @mdesmet @hovaesco PTAL as well.

hashhar avatar Sep 04 '23 12:09 hashhar

In my head the logic should be:

  • On cursor.close()
    • Cancel any already executing queries if they exist
    • Perform implicit ROLLBACK if within a transaction
    • Set some boolean so that calling any method of the cursor leads to Error

We don't do steps 2 and 3 here (both required by DB-API).

  • On cursor.cancel() - this is not a DB-API specified method so the behaviour is up to us to specify
    • Cancel any already executing queries
      • If query already finished, return False
      • If query is running and was cancelled successfully, return True
      • If no query was executed at all (how to differentiate between this and already finished? - presence of query on the cursor?) then raise Exception

hashhar avatar Sep 04 '23 12:09 hashhar

@evansd Me and @hovaesco discussed this offline and we agree with the changes described in my previous comment.

So can you please change cursor.cancel() behaviour to following:

  • If query is running and was cancelled successfully, return True
  • In every other case return False
    • e.g. query was already finished
    • e.g. no query was executed at all

And add some tests to verify this new behavior.

hashhar avatar Sep 15 '23 09:09 hashhar

@evansd Any updates here?

hashhar avatar Oct 13 '23 08:10 hashhar

Not yet, I'm afraid. This isn't a blocking issue for us at the moment as we've worked around the problem we had of the Trino instance going down during CI runs, so the behaviour in question no longer gets triggered.

I'm still planning to make the fix following the behaviour you've outlined; but we've got a lot of other high priority work on at the moment so it won't be for a few weeks.

evansd avatar Oct 13 '23 08:10 evansd

Thanks for the update @evansd.

hashhar avatar Oct 13 '23 08:10 hashhar

Some parts of this were addressed in https://github.com/trinodb/trino-python-client/pull/405

hashhar avatar Dec 22 '23 09:12 hashhar