haystack-core-integrations
haystack-core-integrations copied to clipboard
URL encoding missing
Describe the bug I´ve found an issue using PGVector with Haystack. The problem happens in the file conninfo.py in conninfo_to_dict(), when parsing an string like 'postgresql://postgres:p=ssword@postgres:5432/db' in the psycopg repo.
Reading the Postgresql Documentation, you can see that special characters like '=' in the passwords needs to be encoded in '%3D'. When tested it with the encoding, it worked without any problem.
Currently you get the error: psycopg.OperationalError: [Errno -2] Name or service not known, because the password is in this example 'p' and the hostname is 'ssword@postgres'. I created an issue in psycopg
To Reproduce It happens in this project in the file document_store.py in PgvectorDocumentStore in _ensure_db_setup. The line is: connection = Connection.connect(conn_str). To reproduce, set the var conn_str to 'postgresql://postgres:p=ssword@postgres:5432/db'
Describe your environment (please complete the following information):
- Ubuntu 24.04
- Haystack version: 2.13.2
- Integration version: 3.3.0
The issue in psycopg has been closed. However would it be an option to encode default special chars? It was a very nasty bug for me and I would prevent the pain for any other in the future, who uses special characters in his password.
After an internal discussion, we agreed that there is not much we can do if the connection string is malformed.
TODO:
- Check if the current error message is informative or it can be improved in this regard. (The outcome may also be to do nothing.)
- Update docs/integration page to inform users about this point.