airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Extra parameters in MSSQL Hook breaks sqlalchemy connections

Open emredjan opened this issue 6 months ago • 3 comments

Apache Airflow Provider(s)

microsoft-mssql

Versions of Apache Airflow Providers

apache-airflow-providers-microsoft-mssql==4.4.2

Apache Airflow version

2.10.5

Operating System

RHEL 8.10

Deployment

Virtualenv installation

Deployment details

No response

What happened

Hi. the newly added feature in #44310 breaks existing connections if driver or encrypt parameters are used in the extras:

Probably more parameters may exhibit similar errors. These extras are needed by the MSSQL hook, but we cannot blindly accept these will work in pymssql connections.

Example connection:

{
  "conn_type": "mssql",
  "extra": {
    "driver": "ODBC Driver 18 for SQL Server",
    "encrypt": "yes",
    "sqlalchemy_scheme": "mssql+pyodbc"
  },
    "host": "xxx",
    "login": "xxx",
    "password": "xxx",
    "port": 1433,
    "schema": "master"
}

Errors when using an SQLQueryExecutor using the above connection:

File "src/pymssql/_pymssql.pyx", line 586, in pymssql._pymssql.connect
TypeError: connect() got an unexpected keyword argument 'driver'

File "src/pymssql/_pymssql.pyx", line 586, in pymssql._pymssql.connect
TypeError: connect() got an unexpected keyword argument 'encrypt'

What you think should happen instead

These conenctions should continue working as they did in apache-airflow-providers-microsoft-mssql==3.9.2

How to reproduce

Have MSSQL connections defined with extra parameters Install apache-airflow-providers-microsoft-mssql >= 4.0.0 Try an mssql hook using sqlalchemy

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

emredjan avatar Jun 11 '25 08:06 emredjan

cc @jx2lee

eladkal avatar Jun 11 '25 09:06 eladkal

@eladkal I’ll take a look at it tomorrow!

@emredjan Thanks for the report! Planning to fix it soon — but happy to do Pull request yourself if it’s time-sensitive! I can assist

jx2lee avatar Jun 11 '25 14:06 jx2lee

Thanks @jx2lee , it's not that urgent as we can use 3.9.2 without issues. I may have a look at fixing it, but not really sure what's the best way. Reverting the change, or adding an error checking for extras that are not supported in pymssql, or some other option..

emredjan avatar Jun 12 '25 14:06 emredjan

@emredjan Instead of expanding MsSqlHook to accept ODBC-specific options, why not lean on the hooks that already match each driver?

  • pymssql → MsSqlHook
    The hook defaults to the mssql+pymssql dialect and is tuned for that stack.
  • pyodbc → OdbcHook
    OdbcHook uses pyodbc under the hood and already exposes a driver parameter via connection extras/hook params, so no extra work is needed.

Keeping the hooks focused means:

  1. We avoid adding more branching logic to MsSqlHook.
  2. Users choose the hook that matches their driver, which is consistent with how other database providers work.

What’s your take? Assuming that's reasonable, would you be open to switching to OdbcHook and resolving this issue?

jx2lee avatar Jun 29 '25 12:06 jx2lee

@jx2lee the issue reports a regression. If the current behavior is OK we should at least add entry in the changelog to explain users how to migrate.

eladkal avatar Jul 01 '25 08:07 eladkal

the issue reports a regression. If the current behavior is OK we should at least add entry in the changelog to explain users how to migrate.

@eladkal Thanks for quick response. I believe this behavior is functioning as intended.

I'm not sure how to add the migration guidance to the changelog—Could someone direct me to the relevant docs or examples? If I have the resources, I can handle it.

Thanks!

jx2lee avatar Jul 01 '25 09:07 jx2lee

@jx2lee It seems like using the ODBC connection type is the way to go. There are some implementation differences as they're used in common SQL operators, but it shouldn't take that much to migrate.

Thanks for the suggestions.

emredjan avatar Jul 01 '25 10:07 emredjan

I'm not sure how to add the migration guidance to the changelog—Could someone direct me to the relevant docs or examples? If I have the resources, I can handle it.

As simple as add a warnning clause in the changelog.rst for the version that caused the regression and just write a short paragraph about how to mitigate the issue.

eladkal avatar Jul 01 '25 10:07 eladkal

@eladkal Thanks for letting me know. I'll make PR, and assign this to me. 🙇🏽

jx2lee avatar Jul 01 '25 13:07 jx2lee