trino-python-client
trino-python-client copied to clipboard
Prevent error when closing an unused cursor
SQLAlchemy will attempt to close cursor objects even if, for whatever reason, they haven't actually been used and therefore have no associated query to cancel.
Release notes
( ) This is not user-visible or docs only and no release notes are required. ( ) Release notes are required, please propose a release note for me. (*) Release notes are required, with the following suggested text:
* Support cleanly closing unused cursor
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
Hi, this is slightly tricky to reproduce but I'm seeing it in the tracebacks from failures in CI when our test Trino instance goes down. It's something to do with SQLAlchemy's pool manager trying to close out connections after an error.
Here's the key bit of the traceback:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
cursor.close()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
self.cancel()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query
[info ] failed after 3 attempts [trino.client]
[error ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]
And here's the full thing (click to expand)
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 244, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1286, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1332, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1281, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 1041, in _send_output
self.send(msg)
File "/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/http/client.py", line 979, in send
self.connect()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
self.dialect.do_execute(
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/sqlalchemy/dialect.py", line 373, in do_execute
cursor.execute(statement, parameters)
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 573, in execute
self._prepare_statement(operation, statement_name)
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 448, in _prepare_statement
query.execute()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 806, in execute
response = self._request.post(self._query, additional_http_headers)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 555, in post
http_response = self._post(
^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 899, in decorated
raise error
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/client.py", line 886, in decorated
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=32769): Max retries exceeded with url: /v1/statement (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc60f9c8310>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2199, in _safe_close_cursor
cursor.close()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 712, in close
self.cancel()
File "/home/runner/work/ehrql/ehrql/.venv/lib/python3.11/site-packages/trino/dbapi.py", line 706, in cancel
raise trino.exceptions.OperationalError(
trino.exceptions.OperationalError: Cancel query failed; no running query
[info ] failed after 3 attempts [trino.client]
[error ] Error closing cursor [sqlalchemy.pool.impl.QueuePool]
Oh, interesting. Looking at the CI failures here there's an explicit test for the behaviour which my patch tries to remove: https://github.com/trinodb/trino-python-client/blob/f98a608601f5ba88fa4ba024394d241d4f5921b2/tests/integration/test_dbapi_integration.py#L1343-L1352
But I'm not sure that is the behaviour you actually want. The test was added in fca57d4 and looks like a direct copy-paste of the test for cancel()
:
https://github.com/trinodb/trino-python-client/blob/f98a608601f5ba88fa4ba024394d241d4f5921b2/tests/integration/test_dbapi_integration.py#L1331-L1340
I think that explicitly calling cancel()
on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close()
.
I think that explicitly calling cancel() on a cursor with no query should raise an an exception, but not so for the implicit cancellation that happens on close().
Yes, that is a fair point.
cc: @mdesmet @hovaesco PTAL as well.
In my head the logic should be:
- On
cursor.close()
- Cancel any already executing queries if they exist
- Perform implicit ROLLBACK if within a transaction
- Set some boolean so that calling any method of the
cursor
leads toError
We don't do steps 2 and 3 here (both required by DB-API).
- On
cursor.cancel()
- this is not a DB-API specified method so the behaviour is up to us to specify- Cancel any already executing queries
- If query already finished, return False
- If query is running and was cancelled successfully, return True
- If no query was executed at all (how to differentiate between this and already finished? - presence of query on the cursor?) then raise Exception
- Cancel any already executing queries
@evansd Me and @hovaesco discussed this offline and we agree with the changes described in my previous comment.
So can you please change cursor.cancel()
behaviour to following:
- If query is running and was cancelled successfully, return True
- In every other case return False
- e.g. query was already finished
- e.g. no query was executed at all
And add some tests to verify this new behavior.
@evansd Any updates here?
Not yet, I'm afraid. This isn't a blocking issue for us at the moment as we've worked around the problem we had of the Trino instance going down during CI runs, so the behaviour in question no longer gets triggered.
I'm still planning to make the fix following the behaviour you've outlined; but we've got a lot of other high priority work on at the moment so it won't be for a few weeks.
Thanks for the update @evansd.
Some parts of this were addressed in https://github.com/trinodb/trino-python-client/pull/405