Port incorrectly processed when using `trino.sqlalchemy.URL` and `sa.create_engine`
Expected behavior
Hi!
I am able to connect to a Trino backend by using trino.dbapi.connect but not yet by using sqlalchemy.create_engine. The latter causes problems with my port. I'd much appreciate it if someone could look into this.
Concretely, running
from trino.auth import BasicAuthentication
from trino.dbapi import connect
host_url = "https://trino.apps.k8s.mydomain.com"
port = "443"
catalog = "mycatalog"
schema = "myschema"
username = "kklein"
password = "mypw"
auth = BasicAuthentication(username=username, password=password)
connection = connect(
host=host_url,
port=port,
user=username,
auth=auth,
catalog=catalog,
schema=schema,
)
cursor = connection.cursor()
print(cursor.execute("SELECT * FROM MYTABLE LIMIT 10").fetchall())
works just fine and prints the expected rows and columns. I would expect the same when working with sqlalchemy.Engin objects as follows:
from trino.sqlalchemy import URL
from sqlalchemy import create_engine
create_engine(URL(
host=host_url,
port=443, # I tried both passing an int or a string here.
catalog=catalog,
schema=schema,
user=username,
password=password,
))
Actual behavior
Yet, executing aforementioned code block leads to this ValueError:
ValueError: invalid literal for int() with base 10: ''
Steps To Reproduce
See code above. Unfortunately I don't have access to further information on the Trino server deployment.
Log output
Traceback (most recent call last):
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1777, in main
pdb._run(target)
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1643, in _run
self.run(target.code)
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/bdb.py", line 600, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 86, in <module>
check_trino_connection()
File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 79, in check_trino_connection
engine = create_trino_engine()
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 64, in create_trino_engine
return create_engine(
^^^^^^^^^^^^^^
File "<string>", line 2, in create_engine
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
return fn(*args, **kwargs) # type: ignore[no-any-return]
^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/create.py", line 564, in create_engine
u = _url.make_url(url)
^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 856, in make_url
return _parse_url(name_or_url)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 919, in _parse_url
return URL.create(name, **components) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 213, in create
cls._assert_port(port),
^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 223, in _assert_port
return int(port)
^^^^^^^^^
ValueError: invalid literal for int() with base 10: ''
Moreover, I stepped into the debugger.
At this point, everything still looks as expected to me:
> /home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/create.py(564)create_engine()
-> u = _url.make_url(url)
(Pdb) url
'trino://kklein:mypw@https://trino.apps.k8s.mydomain.com:443/myschema/mycatalog?source=trino-sqlalchemy'
Then, two layers below, the port seems to be not properly extracted, see the empty string value for the "port" key.
> /home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py(919)_parse_url()
-> return URL.create(name, **components) # type: ignore
(Pdb) components
{'username': 'kklein', 'password': 'mypw', 'port': '', 'database': '/trino.apps.k8s.mydomain.com:443/mycatalog/myschema', 'query': {'source': 'trino-sqlalchemy'}, 'host': 'https'}
(Pdb)
Operating System
Ubuntu 24.04.2 LTS
Trino Python client version
0.336.0
Trino Server version
[['476-stackable25.7.0']]
Python version
3.11.13
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
can you see if the following helps:
create_engine(URL.create(
drivername='trino'
host=host_url,
port=443, # I tried both passing an int or a string here.
database=system,
username=username,
password=password,
))
the URL object should be created either using URL.create or make_url. You can see some examples in our tests at https://github.com/trinodb/trino-python-client/blob/2108c38dea79518ffb74370177df2dc95f1e6d96/tests/unit/sqlalchemy/test_dialect.py#L24
Hi @hashhar -- thanks a lot for your response!
Afaict this doesn't solve the problem.
Running
from sqlalchemy import Engine, create_engine
from sqlalchemy.engine.url import make_url
from trino.sqlalchemy import URL
create_engine(
make_url(URL(
host=host_url,
port=443,
catalog=catalog,
schema=schema,
user=username,
password=password,
)),
)
leads to the same error:
Traceback (most recent call last):
File "/home/ubuntu/xxx.pixi/envs/default/lib/python3.11/pdb.py", line 1777, in main
pdb._run(target)
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/pdb.py", line 1643, in _run
self.run(target.code)
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/bdb.py", line 600, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 90, in <module>
create_trino_engine()
File "/home/ubuntu/xxx/xxx/connectors/trino_connector.py", line 69, in create_trino_engine
make_url(URL(
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 856, in make_url
return _parse_url(name_or_url)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 919, in _parse_url
return URL.create(name, **components) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 213, in create
cls._assert_port(port),
^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/xxx/.pixi/envs/default/lib/python3.11/site-packages/sqlalchemy/engine/url.py", line 223, in _assert_port
return int(port)
^^^^^^^^^
ValueError: invalid literal for int() with base 10: ''
I can't pass the driver parameter you mentioned.
You're again using catalog and schema. Those are not valid URL parameters.
Please see my example or the one in tests.
actually you can do print(make_url(....)) to see the generated URL and inspect that.
The pattern I found in the test suite was make_url(trino_url(...)) where trino_url is an alias for trino.sqlalchemy.URL, e.g. here:
https://github.com/trinodb/trino-python-client/blob/2108c38dea79518ffb74370177df2dc95f1e6d96/tests/unit/sqlalchemy/test_dialect.py#L28
This same pattern doesn't seem to work in my case. Merely running URL -- which would be fed to make_url -- as follows
url = URL(
# Not accepted as a paremeter.
# driver="trino",
host=host_url,
port=443,
user=username,
password=password,
)
gives me the same port-related error again.
drivername, not driver. Also if you did print(trino.sqlalchemy.URL(....)) you'll see your issue.
That URL string is what gets passed to driver.
I am not able to reproduce unfortunately. It works for me, here's a snippet you can use to debug your issue:
from sqlalchemy import create_engine
from trino.sqlalchemy import URL
from sqlalchemy.sql.expression import select, text
url = URL(
host="localhost",
port=8080,
catalog="memory",
schema="default",
user="admin",
)
print(url)
engine = create_engine(url)
connection = engine.connect()
rows = connection.execute(text("SELECT * FROM system.runtime.nodes")).fetchall()
print(rows)
drivername, not driver.
Unfortunately, that doesn't work for me either:
url = URL(
^^^^
TypeError: _url() got an unexpected keyword argument 'drivername'
Also if you did print(trino.sqlalchemy.URL(....)) you'll see your issue.
Unfortunately that's my very problem -- since URL() doesn't return but raises an exception, I can't print its return value.
I figured out your issue. you are passing a URL instead of hostname in host. Remove the http://.
If you run my snippet with http://localhost in host you'll see the printed URL and what the issue is. It's invalid URI for sqlalchemy.
This solved it - thanks a lot!!
Is there something we can do to improve the error handling? Happy to contribute.
Doesn't look like it unfortunately. The error comes from SQLAlchemy library. We can try to do a urlparse on the host value and reject if there's anything except the host (e.g. port, authority, scheme, fragment etc.) which gets parsed but it can reject valid input too which SQLAlchemy might be fine with.