URL encoding missing

Open Hansehart opened this issue 6 months ago • 2 comments

Describe the bug I've found an issue using PGVector with Haystack. The problem occurs in psycopg, in conninfo_to_dict() in the file conninfo.py, when parsing a string like 'postgresql://postgres:p=ssword@postgres:5432/db'.

According to the PostgreSQL documentation, special characters such as '=' in a password need to be percent-encoded (for '=', as '%3D'). When I tested it with the encoded password, it worked without any problem.
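As an illustration of the encoding described above, here is a minimal sketch that builds the connection string with percent-encoded credentials using Python's standard library. The helper name build_conn_str is hypothetical and is not part of Haystack or psycopg; it only assumes the URI form shown in this issue.

```python
from urllib.parse import quote


def build_conn_str(user, password, host, port, dbname):
    """Hypothetical helper: build a PostgreSQL URI with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character,
    # including '=', '@', ':' and '/'.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{dbname}"


conn_str = build_conn_str("postgres", "p=ssword", "postgres", 5432, "db")
print(conn_str)  # postgresql://postgres:p%3Dssword@postgres:5432/db
```

The resulting string can then be used as conn_str in place of the unencoded one. Encoding the credentials before assembling the URI, rather than encoding the whole URI afterwards, keeps the '://', '@' and ':' separators intact.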

Currently you get the error: psycopg.OperationalError: [Errno -2] Name or service not known, because in this example the password is parsed as 'p' and the hostname as 'ssword@postgres'. I created an issue in psycopg.

To Reproduce It happens in this project in the file document_store.py, in PgvectorDocumentStore._ensure_db_setup. The line is: connection = Connection.connect(conn_str). To reproduce, set the variable conn_str to 'postgresql://postgres:p=ssword@postgres:5432/db'.

Describe your environment (please complete the following information):

  • Ubuntu 24.04
  • Haystack version: 2.13.2
  • Integration version: 3.3.0

Hansehart avatar May 13 '25 12:05 Hansehart

The issue in psycopg has been closed. However, would it be an option to encode common special characters by default? This was a very nasty bug for me, and I would like to spare anyone else who uses special characters in their password the same pain.

Hansehart avatar May 13 '25 14:05 Hansehart

After an internal discussion, we agreed that there is not much we can do if the connection string is malformed.

TODO:

  • Check whether the current error message is informative enough or can be improved in this regard. (The outcome may also be to do nothing.)
  • Update docs/integration page to inform users about this point.

anakin87 avatar May 15 '25 08:05 anakin87