trino-python-client icon indicating copy to clipboard operation
trino-python-client copied to clipboard

Port incorrectly processed when using `trino.sqlalchemy.URL` and `sa.create_engine`

Open kklein opened this issue 3 months ago • 10 comments

Expected behavior

Hi!

I am able to connect to a Trino backend by using trino.dbapi.connect but not yet by using sqlalchemy.create_engine. The latter causes problems with my port. I'd much appreciate it if someone could look into this.

Concretely, running

from trino.auth import BasicAuthentication
from trino.dbapi import connect

host_url = "https://trino.apps.k8s.mydomain.com"
port = "443"
catalog = "mycatalog"
schema = "myschema"
username = "kklein"
password = "mypw"

auth = BasicAuthentication(username=username, password=password)

connection = connect(
    host=host_url,
    port=port,
    user=username,
    auth=auth,
    catalog=catalog,
    schema=schema,
)

cursor = connection.cursor()
print(cursor.execute("SELECT * FROM MYTABLE LIMIT 10").fetchall())

works just fine and prints the expected rows and columns. I would expect the same when working with sqlalchemy.Engin objects as follows:

from trino.sqlalchemy import URL
from sqlalchemy import create_engine

create_engine(URL(
    host=host_url,
    port=443, # I tried both passing an int or a string here.
    catalog=catalog,
    schema=schema,
    user=username,
    password=password,
))

Actual behavior

Yet, executing aforementioned code block leads to this ValueError:

ValueError: invalid literal for int() with base 10: ''

Steps To Reproduce

See code above. Unfortunately I don't have access to further information on the Trino server deployment.

Log output

Traceback (most recent call last):
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1777, in main
    pdb._run(target)
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1643, in _run
    self.run(target.code)
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/bdb.py", line 600, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 86, in <module>
    check_trino_connection()
  File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 79, in check_trino_connection
    engine = create_trino_engine()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 64, in create_trino_engine
    return create_engine(
           ^^^^^^^^^^^^^^
  File "<string>", line 2, in create_engine
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
    return fn(*args, **kwargs)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/create.py", line 564, in create_engine
    u = _url.make_url(url)
        ^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 856, in make_url
    return _parse_url(name_or_url)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 919, in _parse_url
    return URL.create(name, **components)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 213, in create
    cls._assert_port(port),
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 223, in _assert_port
    return int(port)
           ^^^^^^^^^
ValueError: invalid literal for int() with base 10: ''

Moreover, I stepped into the debugger.

At this point, everything still looks as expected to me:

> /home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/create.py(564)create_engine()
-> u = _url.make_url(url)
(Pdb) url
'trino://kklein:mypw@https://trino.apps.k8s.mydomain.com:443/myschema/mycatalog?source=trino-sqlalchemy'

Then, two layers below, the port seems to be not properly extracted, see the empty string value for the "port" key.

> /home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py(919)_parse_url()
-> return URL.create(name, **components)  # type: ignore
(Pdb) components
{'username': 'kklein', 'password': 'mypw', 'port': '', 'database': '/trino.apps.k8s.mydomain.com:443/mycatalog/myschema', 'query': {'source': 'trino-sqlalchemy'}, 'host': 'https'}
(Pdb)

Operating System

Ubuntu 24.04.2 LTS

Trino Python client version

0.336.0

Trino Server version

[['476-stackable25.7.0']]

Python version

3.11.13

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

kklein avatar Sep 22 '25 12:09 kklein

can you see if the following helps:

create_engine(URL.create(
    drivername='trino'
    host=host_url,
    port=443, # I tried both passing an int or a string here.
    database=system,
    username=username,
    password=password,
))

the URL object should be created either using URL.create or make_url. You can see some examples in our tests at https://github.com/trinodb/trino-python-client/blob/2108c38dea79518ffb74370177df2dc95f1e6d96/tests/unit/sqlalchemy/test_dialect.py#L24

hashhar avatar Sep 25 '25 09:09 hashhar

Hi @hashhar -- thanks a lot for your response!

Afaict this doesn't solve the problem.

Running

from sqlalchemy import Engine, create_engine
from sqlalchemy.engine.url import make_url
from trino.sqlalchemy import URL

create_engine(
    make_url(URL(
        host=host_url,
        port=443,
        catalog=catalog,
        schema=schema,
        user=username,
        password=password,
  )),
)

leads to the same error:

Traceback (most recent call last):
  File "/home/ubuntu/xxx.pixi/envs/default/lib/python3.11/pdb.py", line 1777, in main
    pdb._run(target)
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1643, in _run
    self.run(target.code)
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/bdb.py", line 600, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 90, in <module>
    create_trino_engine()
  File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 69, in create_trino_engine
    make_url(URL(
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 856, in make_url
    return _parse_url(name_or_url)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 919, in _parse_url
    return URL.create(name, **components)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 213, in create
    cls._assert_port(port),
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 223, in _assert_port
    return int(port)
           ^^^^^^^^^
ValueError: invalid literal for int() with base 10: ''

I can't pass the driver parameter you mentioned.

kklein avatar Sep 25 '25 10:09 kklein

You're again using catalog and schema. Those are not valid URL parameters. Please see my example or the one in tests.

hashhar avatar Sep 25 '25 10:09 hashhar

actually you can do print(make_url(....)) to see the generated URL and inspect that.

hashhar avatar Sep 25 '25 10:09 hashhar

The pattern I found in the test suite was make_url(trino_url(...)) where trino_url is an alias for trino.sqlalchemy.URL, e.g. here:

https://github.com/trinodb/trino-python-client/blob/2108c38dea79518ffb74370177df2dc95f1e6d96/tests/unit/sqlalchemy/test_dialect.py#L28

This same pattern doesn't seem to work in my case. Merely running URL -- which would be fed to make_url -- as follows

url = URL(
    # Not accepted as a paremeter.
    # driver="trino",
    host=host_url,
    port=443,
    user=username,
    password=password,
)

gives me the same port-related error again.

kklein avatar Sep 25 '25 10:09 kklein

drivername, not driver. Also if you did print(trino.sqlalchemy.URL(....)) you'll see your issue. That URL string is what gets passed to driver.

I am not able to reproduce unfortunately. It works for me, here's a snippet you can use to debug your issue:

from sqlalchemy import create_engine
from trino.sqlalchemy import URL
from sqlalchemy.sql.expression import select, text

url = URL(
        host="localhost",
        port=8080,
        catalog="memory",
        schema="default",
        user="admin",
)
print(url)
engine = create_engine(url)
connection = engine.connect()
rows = connection.execute(text("SELECT * FROM system.runtime.nodes")).fetchall()
print(rows)

hashhar avatar Sep 25 '25 11:09 hashhar

drivername, not driver.

Unfortunately, that doesn't work for me either:

    url = URL(

          ^^^^

TypeError: _url() got an unexpected keyword argument 'drivername'

Also if you did print(trino.sqlalchemy.URL(....)) you'll see your issue.

Unfortunately that's my very problem -- since URL() doesn't return but raises an exception, I can't print its return value.

kklein avatar Sep 25 '25 11:09 kklein

I figured out your issue. you are passing a URL instead of hostname in host. Remove the http://.

If you run my snippet with http://localhost in host you'll see the printed URL and what the issue is. It's invalid URI for sqlalchemy.

hashhar avatar Sep 25 '25 15:09 hashhar

This solved it - thanks a lot!!

Is there something we can do to improve the error handling? Happy to contribute.

kklein avatar Sep 25 '25 17:09 kklein

Doesn't look like it unfortunately. The error comes from SQLAlchemy library. We can try to do a urlparse on the host value and reject if there's anything except the host (e.g. port, authority, scheme, fragment etc.) which gets parsed but it can reject valid input too which SQLAlchemy might be fine with.

hashhar avatar Sep 29 '25 18:09 hashhar