connector-x
connector-x copied to clipboard
special characters on password gives db error
What language are you using?
Python
What version are you using?
0.3.1
What database are you using?
PostgreSQL
What dataframe are you using?
Arrow / Polars
Can you describe your bug?
I have both "#" and "(" characters in my credentials. From the other bug reports (#319), it looks like urllib.parse.quote_plus
helps with the "#". But the "(" is now throwing "RuntimeError: db error: ERROR: syntax error at or near '('". The parenthesis is nowhere else except in my credentials.
It works fine with sqlalchemy
. Currently using polars
but doesn't seem to be on the polars side since the error gets thrown at the cx.read_sql()
portion.
What are the steps to reproduce the behavior?
See example query.
Database setup if the error only happens on specific data or data type
Example query / code
import polars as pl
import urllib
import cred
pw = urllib.parse.quote_plus(cred.password)
uri = f'postgresql://{cred.username}:{pw}@{cred.host}:{cred.port}/{cred.dbname}'
query = "SELECT * from db LIMIT 5"
df = pd.read_database_uri(query=query, uri=uri)
What is the error?
RuntimeError Traceback (most recent call last)
Cell In[35], line 24
19 query = """
20 SELECT *
21 FROM db
22 LIMIT 5
23 """
---> 24 df = pl.read_database_uri(query=query, uri = uri)
File /opt/jupyterhub/lib64/python3.8/site-packages/polars/io/database.py:707, in read_database_uri(query, uri, partition_on, partition_range, partition_num, protocol, engine, schema_overrides)
704 engine = "connectorx"
706 if engine == "connectorx":
--> 707 return _read_sql_connectorx(
708 query,
709 connection_uri=uri,
710 partition_on=partition_on,
711 partition_range=partition_range,
712 partition_num=partition_num,
713 protocol=protocol,
714 schema_overrides=schema_overrides,
715 )
716 elif engine == "adbc":
717 if not isinstance(query, str):
File /opt/jupyterhub/lib64/python3.8/site-packages/polars/io/database.py:755, in _read_sql_connectorx(query, connection_uri, partition_on, partition_range, partition_num, protocol, schema_overrides)
752 except BaseException as err:
753 # basic sanitisation of /user:pass/ credentials exposed in connectorx errs
754 errmsg = re.sub("://[^:]+:[^:]+@", "://***:***@", str(err))
--> 755 raise type(err)(errmsg) from err
757 return from_arrow(tbl, schema_overrides=schema_overrides)
RuntimeError: db error: ERROR: syntax error at or near "("
I am having this exact same issue -- have tried several versions of connectorx and polars, but cannot successfully connect to a redshift database. I get an identical error, though I don't have a "(" in my password -- I have a "[".
If I try using cx.read_sql instead of pl.read_database_uri, I get the below error message regardless of which type of dataframe I set as the return_type (and I do have pandas installed).
RuntimeError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\connectorx_init_.py in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col) 262 raise ValueError("You need to install pandas first") 263 --> 264 result = _read_sql( 265 conn, 266 "pandas",
RuntimeError: db error: ERROR: syntax error at or near "("
The raised exception lives in Polars, https://github.com/pola-rs/polars/blob/835d1981dd13b84f466b4bb3ae75e0ea347f8c4e/py-polars/polars/io/database/_utils.py#L78 but I cannot find where it comes from (the connector-x section)
docker run -e POSTGRES_PASSWORD='·$)(%·!"$%("·$(%="!![[][·$%#("·/$%!"($&' -d -p 5432:5432 --rm --name postgres postgres
import connectorx
import urllib.parse
password = '·$)(%·!"$%("·$(%="!![[][·$%#("·/$%!"($&'
cn = connectorx.read_sql(f'postgres://postgres:{urllib.parse.quote_plus(password)}@localhost:5432',
'SELECT 1;')
Works fine.
my solution:
import connectorx as cx
from urllib.parse import quote_plus
pwd = quote_plus('mypassword##')
SQLALCHEMY_GAUSSDB_PROD_CONNECT_STR = f'postgresql://myusername:{pwd}@xxx:xxx/xxx' # not working
SQLALCHEMY_GAUSSDB_PROD_CONNECT_STR = f'redshift://myusername:{pwd}@xxx:xxx/xxx' # working!
sql = "SELECT 1 FROM dual;"
start = time.time()
try:
result = cx.read_sql(SQLALCHEMY_GAUSSDB_PROD_CONNECT_STR, sql)
logger.warning(len(result))
except Exception as e:
print(f"查询失败: {e}")
But I do not know if the server has limit this connection. My database is GAUSSDB
these workarounds aren't working on redshift....ignore: redshift:// instead of postgres:// worked