trino-python-client icon indicating copy to clipboard operation
trino-python-client copied to clipboard

Infinite version request loop (FastAPI with SQLAlchemy hitting a Trino docker desktop container)

Open jonroquet2 opened this issue 10 months ago • 6 comments

Expected behavior

  1. Web application starts
  2. Post request to application causes a connection to local trino server
  3. Connection makes a simple query (select * from tpch.sf1.nation) using sqlalchemy
  4. Data rows returned and logged

Code: `

engine = create_engine("trino://jon@localhost:8080/system")
conn = engine.connect()

result = conn.execute(text("select 1 as x")).fetchall()
for row in result:
    logger.debug(row)

`

Actual behavior

Infinite loop of code requests calling "select version()" hit the local trino server, and python goes into an infinite loop:

(Note, this is one of several thousand, this never stops)

INFO 2025-05-21 15:58:51,240 trino.client 1013 - failed after 1 attempts DEBUG 2025-05-21 15:58:51,243 urllib3.connectionpool 241 - Starting new HTTP connection (1269): localhost:8080 DEBUG 2025-05-21 15:58:51,255 urllib3.connectionpool 544 - http://localhost:8080 "POST /v1/statement HTTP/1.1" 200 351 DEBUG 2025-05-21 15:58:51,261 urllib3.connectionpool 544 - http://localhost:8080 "GET /v1/statement/queued/20250521_205902_04901_q83nv/ya189522748922861480aa59bd1ca7af2d04dc2fc/1 HTTP/1.1" 200 351 DEBUG 2025-05-21 15:58:51,268 urllib3.connectionpool 544 - http://localhost:8080 "GET /v1/statement/queued/20250521_205902_04901_q83nv/yae1ef533103b0d4dfe33f7c45c07325f4aa00b79/2 HTTP/1.1" 200 359 DEBUG 2025-05-21 15:58:51,276 urllib3.connectionpool 544 - http://localhost:8080 "GET /v1/statement/executing/20250521_205902_04901_q83nv/y201189cbc98d1a1f97104f718b9c6372baf9e43f/0 HTTP/1.1" 200 558 DEBUG 2025-05-21 15:58:51,284 urllib3.connectionpool 544 - http://localhost:8080 "GET /v1/statement/executing/20250521_205902_04901_q83nv/ya857eae45b999e5c9ab9fa64ff14b3a9e8a736a7/1 HTTP/1.1" 200 479

Steps To Reproduce

DBMS: Run a local docker container using trinodb/trino:latest OS: WIndows 11 Python App: Python 3.10 or 3.12 (attempted using both) Python Libraries: trino==0.334.0 sqlalchemy==2.0.38

Using the above code, attempt to call your local docker trino. Any query fails - it never gets to the actual query.

Log output

The issue is the select version call in dialect.py: def _get_server_version_info(cls, connection: Connection) -> Any: def get_server_version_info(_): query = "SELECT version()" try: res = connection.execute(sql.text(query)) version = res.scalar() return tuple([version]) except exc.ProgrammingError as e: logger.debug(f"Failed to get server version: {e.orig.message}") return None

When I replace this with "return 475", all behavior is now as expected.

Operating System

Windows 11

Trino Python client version

0.334.0

Trino Server version

475

Python version

3.12.9

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

jonroquet2 avatar May 21 '25 21:05 jonroquet2

@hashhar I'm now receiving this same problem in Trino 476 deployed via image to an AWS ECS (using a single worker).

Image

jonroquet2 avatar Jun 25 '25 15:06 jonroquet2

cannot reproduce.

from sqlalchemy import create_engine, text
import logging

logging.basicConfig(level=logging.DEBUG)

engine = create_engine("trino://jon@localhost:8080/system")
conn = engine.connect()

result = conn.execute(text("select 1 as x")).fetchall()
for row in result:
    logging.debug(row)
    # print(row)

Output looks like:

python infinite_version_loop.py
DEBUG:tzlocal:/etc/localtime found
DEBUG:tzlocal:1 found:
 {'/etc/localtime is a symlink to': 'Asia/Kolkata'}
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:8080
DEBUG:urllib3.connectionpool:http://localhost:8080 "POST /v1/statement HTTP/1.1" 200 352
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/queued/20250814_145811_00001_qnffi/y0e572f0ba2963ab3b2eddd6b93b64109f300f550/1 HTTP/1.1" 200 354
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/queued/20250814_145811_00001_qnffi/y9c469d2fb41398810dedd777393c1d36ec247878/2 HTTP/1.1" 200 359
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/executing/20250814_145811_00001_qnffi/y8f42d03d14f9f2659fa74746aefc550813959e78/0 HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/executing/20250814_145811_00001_qnffi/ye50bb29d7b59af8139567044be323c9b1a2181eb/1 HTTP/1.1" 200 527
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/executing/20250814_145811_00001_qnffi/y1e28046e23ee081d40c50d808ddf017601a057dd/2 HTTP/1.1" 200 523
DEBUG:urllib3.connectionpool:http://localhost:8080 "GET /v1/statement/executing/20250814_145811_00001_qnffi/y2856b9f48af2f0580addf45b39e110db82ea0e94/3 HTTP/1.1" 200 448
DEBUG:root:(1,)

hashhar avatar Aug 14 '25 14:08 hashhar

actually I don't even see a SELECT version() issued in my Trino query history. Can you share more information? For example by enabling debug logs for the client so that we can see what is trigerring the queries?

hashhar avatar Aug 14 '25 15:08 hashhar

@hashhar I can work on this later this week - What level of logging do you need set where?

We are still getting this bug - We actually had to patch our dev release with a faked version of the SQLAlchemy dialect code:

Image

jonroquet2 avatar Aug 18 '25 17:08 jonroquet2

Something like adding a traceback.print_stack() into the get_server_version method to see what's triggering it to get called should help. I couldn't trigger the method being called so I couldn't debug.

hashhar avatar Aug 20 '25 12:08 hashhar

@hashhar Sorry about the delay in this.

It appears that there is some interaction with AWS XRay going on. Here's the output when I allow the looping to happen (I remove my local hack) - I killed this from task manager after about 73 loops:

loops.txt

When I run it with the patch in place, I still get slightly odd stacks where it looks like its running twice but I see no oddity:

File "X:\my_user_windir\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\threading.py", line 973, in _bootstrap self._bootstrap_inner() File "X:\my_user_windir\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\threading.py", line 1016, in _bootstrap_inner self.run() File "X:\my_local_path\Python\pyxplat\lib\site-packages\anyio_backends_asyncio.py", line 962, in run result = context.run(func, *args) File "X:\my_local_path_app\app\apps\live_agg_app_trino\endpoint_live_agg_trino.py", line 52, in metric_request [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\endpoint_live_agg_trino.py", line 95, in metric_request_deliverable_v2 [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\service_live_agg_trino.py", line 52, in process_metrics_to_dataframe [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\helper_live_agg_trino.py", line 55, in process_liveagg_request [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\helper_live_agg_trino.py", line 219, in __ladr_to_time_date [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_base.py", line 233, in query_for_df [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_trino.py", line 205, in query_for_df [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_trino.py", line 246, in __query_for_df df = pandas.read_sql(sqlalchemy.text(sql), conn, dtype_backend = "pyarrow") File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 734, in read_sql return pandas_sql.read_query( File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 1836, in read_query result = self.execute(sql, params) File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 1660, in execute return self.con.execute(sql, *args) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 58, in _xray_traced_sqlalchemy_execute return _process_request(wrapped, instance, args, kwargs) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 66, in _process_request name, sql = _sql_meta(engine_instance, args) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 40, in _sql_meta if engine_instance.dialect.server_version_info is not None: File "X:\my_local_path\Python\pyxplat\lib\site-packages\trino\sqlalchemy\dialect.py", line 415, in get_server_version_info traceback.print_stack() File "X:\my_user_windir\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\threading.py", line 973, in _bootstrap self._bootstrap_inner() File "X:\my_user_windir\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\threading.py", line 1016, in _bootstrap_inner self.run() File "X:\my_local_path\Python\pyxplat\lib\site-packages\anyio_backends_asyncio.py", line 962, in run result = context.run(func, *args) File "X:\my_local_path_app\app\apps\live_agg_app_trino\endpoint_live_agg_trino.py", line 52, in metric_request [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\endpoint_live_agg_trino.py", line 95, in metric_request_deliverable_v2 [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\service_live_agg_trino.py", line 52, in process_metrics_to_dataframe [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\helper_live_agg_trino.py", line 55, in process_liveagg_request [replaced_code] File "X:\my_local_path_app\app\apps\live_agg_app_trino\helper_live_agg_trino.py", line 219, in __ladr_to_time_date [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_base.py", line 233, in query_for_df [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_trino.py", line 205, in query_for_df [replaced_code] File "X:\my_local_path_app\app\arch\database\dbsa_trino.py", line 246, in __query_for_df df = pandas.read_sql(sqlalchemy.text(sql), conn, dtype_backend = "pyarrow") File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 734, in read_sql return pandas_sql.read_query( File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 1836, in read_query result = self.execute(sql, params) File "X:\my_local_path\Python\pyxplat\lib\site-packages\pandas\io\sql.py", line 1660, in execute return self.con.execute(sql, *args) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 58, in _xray_traced_sqlalchemy_execute return _process_request(wrapped, instance, args, kwargs) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 66, in _process_request name, sql = _sql_meta(engine_instance, args) File "X:\my_local_path\Python\pyxplat\lib\site-packages\aws_xray_sdk\ext\sqlalchemy_core\patch.py", line 41, in _sql_meta metadata['database_version'] = '.'.join(map(str, engine_instance.dialect.server_version_info)) File "X:\my_local_path\Python\pyxplat\lib\site-packages\trino\sqlalchemy\dialect.py", line 415, in get_server_version_info traceback.print_stack()

jonroquet2 avatar Sep 08 '25 19:09 jonroquet2