Connection to Postgres failed due to special characters in a password
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.7.1
What happened?
The Airflow UI "Test connection" button reports success, but when a task uses PostgresHook I get a connection error (database name `TEST_ETL`, for example). The Airflow metadata database works fine, i.e. we are using the same server, just a different database.
In the UI the connection test succeeds. In the task we then get this error:

```
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) invalid dsn: invalid connection option "extra"
(Background on this error at: https://sqlalche.me/e/14/f405)
[2024-03-15, 15:15:08 UTC] {taskinstance.py:1398} INFO - Marking task as FAILED. dag_id=data_ingestion, task_id=set_static_data, execution_date=20240315T151344, start_date=20240315T151431, end_date=20240315T151508
[2024-03-15, 15:15:08 UTC] {standard_task_runner.py:104} ERROR - Failed to execute job 10 for task set_static_data ((psycopg2.ProgrammingError) invalid dsn: invalid connection option "extra" (Background on this error at: https://sqlalche.me/e/14/f405); 123)
[2024-03-15, 15:15:08 UTC] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2024-03-15, 15:15:08 UTC] {taskinstance.py:2776} INFO - 0 downstream tasks scheduled from follow-on schedule check
```
Passwords that reliably reproduced the failure:
- "+&I>[A1FEyxze-P3"
- "!&*Rr1(-_=[{<?"
I guess some special characters or combinations are not handled correctly, or it fails when a special character is at the first position of the password.
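For reference, special characters in a password are only safe inside a connection URI when percent-encoded. A minimal stdlib sketch, using one of the passwords above:

```python
from urllib.parse import quote

password = "+&I>[A1FEyxze-P3"

# Percent-encode every reserved character so the password can be embedded
# in a URI without being misparsed (safe="" encodes everything except
# letters, digits, and "_.-~").
encoded = quote(password, safe="")
print(encoded)  # %2B%26I%3E%5BA1FEyxze-P3
```

If the hook re-embeds an already-decoded password into a URI without this step, characters like `+`, `&`, and `[` can corrupt the DSN.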
What do you think should happen instead?
If the UI reports success when testing the connection, and Airflow itself works fine with its own database using the same password, then PostgresHook inside DAGs/tasks should work as well, but it seems it does not.
How to reproduce
Just create a connection to the database in the UI and test it. Possible passwords:
- `+&I>[A1FEyxze-P3`
- `!&*Rr1(-_=[{<?`
and for example code in a task:
```python
import pandas as pd
from airflow.providers.postgres.hooks.postgres import PostgresHook

psql_hook = PostgresHook(postgres_conn_id="postgres_conn_id")

data = {
    "product_name": ["Computer", "Tablet", "Monitor", "Printer"],
    "price": [900, 300, 450, 150],
}
df = pd.DataFrame(data, columns=["product_name", "price"])

df.to_sql(
    name="test_product",
    con=psql_hook.get_sqlalchemy_engine(),
    index=False,
    if_exists="replace",
)
```
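As a workaround sketch (hypothetical host and user values, assuming SQLAlchemy 1.4+), building the engine URL with `sqlalchemy.engine.URL.create` sidesteps manual escaping, since it percent-encodes the credentials when rendering the URI:

```python
from sqlalchemy.engine import URL

# Hypothetical connection details; URL.create stores the password verbatim
# and escapes it when the URL is rendered, so special characters need no
# manual percent-encoding.
url = URL.create(
    drivername="postgresql+psycopg2",
    username="etl_user",          # assumed
    password="+&I>[A1FEyxze-P3",  # one of the problematic passwords
    host="db.example.com",        # assumed
    port=5432,
    database="TEST_ETL",
)
print(url.password)   # stored as-is, escaped only on rendering
print(url.database)   # TEST_ETL
```

The resulting `url` object can be passed directly to `sqlalchemy.create_engine` in place of a hand-built DSN string.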
Operating System
Linux
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
Deployed in k8s; connections were set up and tested via the REST API and manually.
Anything else?
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
I think it is because of special characters at the start of the password (i.e. the first character is a special character).
What do you provide as an extra dictionary for this connection? (you can hide the values if there is something confidential)
Ah, that one happens because PostgresHook builds an incorrect SA URI: it just tries to convert the Airflow URI into an SA URI.
That is not the only one; let me create a generic task for all known hooks which incorrectly build SA URIs.
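To illustrate the failure mode with a hypothetical stored URI: Airflow serialises hook-specific settings into an `extra` query parameter of the connection URI, and a naive URI-to-DSN conversion forwards that parameter to the driver, which psycopg2 then rejects as an unknown connection option. A stdlib sketch:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical Airflow connection URI: hook settings live in the "extra"
# query parameter as URL-encoded JSON.
airflow_uri = (
    "postgres://etl_user:secret@db.example.com:5432/TEST_ETL"
    "?extra=%7B%22sslmode%22%3A%20%22require%22%7D"
)

query = parse_qs(urlsplit(airflow_uri).query)

# A naive conversion forwards every query parameter into the DSN, so the
# driver sees a connection option named "extra" -- exactly the
# 'invalid connection option "extra"' error in the log above.
print("extra" in query)   # True
print(query["extra"][0])  # {"sslmode": "require"}
```

The fix is to unpack the JSON in `extra` into individual driver options (or drop unsupported keys) rather than forwarding the raw parameter.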
~~PR to fix this issue: Here's the PR: https://github.com/apache/airflow/pull/38831 @DEworkerDE , I've already tested the fix and it's working. Please test it if you find time.~~
I just realised that the actual issue isn't the password characters, but rather the addition of incompatible extra fields.
@Taragolis , I think, there's nothing to be actually fixed for this issue. What do you think?