
Fix all deprecations for SQLAlchemy 2.0

Open potiuk opened this issue 2 years ago • 40 comments

Body

Airflow is currently not compatible with SQLAlchemy 2.0 which is about to be released. We need to make a deliberate effort to support it.

Here is some info to aid in this effort:

  • Description of all removed features in SQLAlchemy 2.0: https://sqlalche.me/e/b8d9

  • The way to see all the deprecations in the latest version of SQLAlchemy 1.4 is to set the environment variable SQLALCHEMY_WARN_20=1

  • How to start it:

    • add SQLALCHEMY_WARN_20=1 to ci.yml at top level
    • look through warnings.txt in tests to investigate the warnings (there will also be errors in the provider imports)
    • when all of them are fixed - remove <2.0 limitation in sqlalchemy in setup.cfg
    • make sure CI is green
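To make the "look through warnings.txt" step less tedious, something like the following could group the SQLAlchemy 2.0 warnings by message. This is only a sketch: it assumes each line of warnings.txt has the same `path:line category - message` shape as the checklist entries in this issue, which may not match the real file exactly, and `summarize_warnings` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed line shape (taken from the checklist in this issue, not verified
# against the real warnings.txt):
#   path/to/file.py:123 sqlalchemy.exc.RemovedIn20Warning - message text
WARNING_LINE = re.compile(
    r"^(?P<path>\S+):(?P<lineno>\d+)\s+(?P<category>[\w.]+)\s+-\s+(?P<message>.*)$"
)


def summarize_warnings(lines):
    """Count RemovedIn20Warning entries, grouped by their message text."""
    counts = Counter()
    for line in lines:
        match = WARNING_LINE.match(line.strip())
        if match and match["category"].endswith("RemovedIn20Warning"):
            # Drop the trailing "(Background on ...)" links so that
            # identical warnings from different files group together.
            message = re.sub(r"\s*\(Background on .*$", "", match["message"])
            counts[message] += 1
    return counts


sample = [
    "airflow/models/taskinstance.py:1869 sqlalchemy.exc.RemovedIn20Warning - "
    "Using strings in loader options is deprecated. "
    "(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)",
    "airflow/models/trigger.py:141 sqlalchemy.exc.RemovedIn20Warning - "
    "Using strings in loader options is deprecated. "
    "(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)",
    "airflow/www/views.py:3455 sqlalchemy.exc.RemovedIn20Warning - "
    "The Row.keys() method is considered legacy. "
    "(Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)",
]

for message, count in summarize_warnings(sample).most_common():
    print(count, message)
```

With a real file, `summarize_warnings(open("warnings.txt"))` would do the same; sorting by count gives a rough idea of which deprecation to tackle first.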

Local setup in PyCharm

If you use PyCharm to run tests, you might want to set SQLALCHEMY_WARN_20=1 for all pytest runs by default:

  1. Go to Run -> Edit Configurations... -> click on Edit configuration templates...
  2. In the window that opens, select Python Tests -> pytest and put SQLALCHEMY_WARN_20=1 into the environment variables section
  3. Click Apply, then click OK
  4. That's it, every new pytest run configuration will have this option by default
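For people not using PyCharm, roughly the same effect could be had from a local conftest.py. This is a sketch, not something that exists in the repo: `enable_sqlalchemy_20_warnings` is a hypothetical helper, and it relies on my understanding that SQLAlchemy 1.4 reads SQLALCHEMY_WARN_20 once, at import time, so the variable has to be set before anything imports sqlalchemy.

```python
import os
import sys


def enable_sqlalchemy_20_warnings() -> bool:
    """Set SQLALCHEMY_WARN_20=1 for the current process.

    Returns False when sqlalchemy has already been imported, in which case
    the flag is probably set too late to have any effect.
    """
    os.environ["SQLALCHEMY_WARN_20"] = "1"
    return "sqlalchemy" not in sys.modules


if not enable_sqlalchemy_20_warnings():
    print(
        "sqlalchemy is already imported; set SQLALCHEMY_WARN_20 earlier",
        file=sys.stderr,
    )
```

Exporting the variable in the shell before starting pytest avoids the import-order question entirely, so treat the snippet as a convenience only.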

Known incompatibilities with SA 2.0

Reported in core

  • [x] airflow/models/taskinstance.py:1869 sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/models/taskinstance.py:1871 sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/www/views.py:845 sqlalchemy.exc.RemovedIn20Warning - The "columns" argument to Select.with_only_columns(), when referring to a sequence of items, is now passed as a series of positional elements, rather than as a list. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/www/views.py:3455 sqlalchemy.exc.RemovedIn20Warning - The Row.keys() method is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Use the namedtuple standard accessor Row._fields, or for full mapping behavior use row._mapping.keys() (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/www/views.py:3455 sqlalchemy.exc.RemovedIn20Warning - Retrieving row members using strings or other non-integers is deprecated; use row._mapping for a dictionary interface to the row (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/www/views.py:3293 sqlalchemy.exc.RemovedIn20Warning - The Row.keys() method is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Use the namedtuple standard accessor Row._fields, or for full mapping behavior use row._mapping.keys() (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/www/views.py:3293 sqlalchemy.exc.RemovedIn20Warning - Retrieving row members using strings or other non-integers is deprecated; use row._mapping for a dictionary interface to the row (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/jobs/scheduler_job_runner.py:1642 sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/models/trigger.py:141 sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/utils/db_cleanup.py:200 sqlalchemy.exc.RemovedIn20Warning - The bind argument for schema methods that invoke SQL against an engine or connection will be required in SQLAlchemy 2.0. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] airflow/cli/commands/task_command.py:202 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [x] airflow/models/taskinstance.py:524 sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

Reported in providers

  • [ ] airflow/providers/openlineage/utils/sql.py:152 sqlalchemy.exc.RemovedIn20Warning - The MetaData.bind argument is deprecated and will be removed in SQLAlchemy 2.0. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] airflow/providers/fab/auth_manager/security_manager/override.py:1509 sqlalchemy.exc.RemovedIn20Warning - "User" object is being merged into a Session along the backref cascade path for relationship "Role.user"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

Reported in tests

  • [ ] tests/providers/fab/auth_manager/api_endpoints/test_user_schema.py:63 sqlalchemy.exc.RemovedIn20Warning - "User" object is being merged into a Session along the backref cascade path for relationship "Role.user"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/jobs/test_backfill_job.py:1912 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/models/test_dagrun.py:2090 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/models/test_taskinstance.py:1487 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/ti_deps/deps/test_trigger_rule_dep.py:140 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/utils/test_log_handlers.py:475 sqlalchemy.exc.RemovedIn20Warning - "Trigger" object is being merged into a Session along the backref cascade path for relationship "TaskInstance.trigger"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/api_experimental/common/test_mark_tasks.py:137 sqlalchemy.exc.RemovedIn20Warning - The eagerload construct is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Please use joinedload(). (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/api_connexion/endpoints/test_task_instance_endpoint.py:190 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/api_connexion/schemas/test_task_instance_schema.py:69 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/api_connexion/schemas/test_task_instance_schema.py:113 sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/serialization/test_pydantic_models.py:162 sqlalchemy.exc.RemovedIn20Warning - "DagScheduleDatasetReference" object is being merged into a Session along the backref cascade path for relationship "DatasetModel.consuming_dags"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • [ ] tests/serialization/test_pydantic_models.py:163 sqlalchemy.exc.RemovedIn20Warning - "TaskOutletDatasetReference" object is being merged into a Session along the backref cascade path for relationship "DatasetModel.producing_tasks"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

Committer

  • [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

potiuk avatar Jan 04 '23 10:01 potiuk

Just a clarification: is our target to support both 1.4.x and 2.0, or only 2.0?

Taragolis avatar Jan 04 '23 13:01 Taragolis

Just a clarification: is our target to support both 1.4.x and 2.0, or only 2.0?

This is a good point, and I do not know the answer. I think it will entirely depend on:

a) when we will do it
b) how difficult it will be to keep compatibility
c) some indication of the adoption of 2.0 once it is out there

I would refrain from trying to figure out the answer before the final 2.0 release is out there, and possibly we should even wait for at least 2.0.1 or so; generally we should scout for stability indications. And then we should attempt to migrate and see how difficult it is to keep it compatible with both.

If we decide to support both, then given the many differences and the generally backwards-incompatible nature of this SQLAlchemy release, I would only be happy with it if we extend our test suite in CI and add another test matrix dimension, sqlalchemy_version, on top of the ones we have now (backend + backend version, Python version).

But we will also have to see if that is really needed, depending on how complex the fixing will be and how different the two versions are. If the differences are small, we can likely get-by without having the extra matrix dimension.
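To give a feel for what the extra dimension costs, here is a small sketch that expands a test matrix with and without a SQLAlchemy axis. The backend and Python versions are hypothetical placeholders, not the real Airflow CI matrix, and `build_matrix` is a name made up for the illustration.

```python
def build_matrix(backends, python_versions, sqlalchemy_versions):
    """Expand every combination of the given dimensions into CI jobs."""
    return [
        {
            "backend": backend,
            "backend_version": backend_version,
            "python": python,
            "sqlalchemy": sqlalchemy_version,
        }
        for backend, backend_versions in backends.items()
        for backend_version in backend_versions
        for python in python_versions
        for sqlalchemy_version in sqlalchemy_versions
    ]


# Hypothetical values, chosen only to show how the job count grows.
BACKENDS = {"postgres": ["12", "16"], "mysql": ["8.0"], "sqlite": ["default"]}
PYTHON_VERSIONS = ["3.8", "3.11"]

today = build_matrix(BACKENDS, PYTHON_VERSIONS, ["1.4"])
with_both = build_matrix(BACKENDS, PYTHON_VERSIONS, ["1.4", "2.0"])
print(len(today), len(with_both))
```

Every existing combination is duplicated once per supported SQLAlchemy version, which is why it only makes sense if the two versions really behave differently.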

If we see that differences are big, I think going 2.0-only is a better approach (and it basically means we will have to wait at least 6-10 months after release depending on adoption IMHO).

There is absolutely no hurry with migration yet, 1.4 will be out there for a long while and it will be supported with bugfixes - so we do not have (and will not have for a while) particular need to migrate to 2.0.

potiuk avatar Jan 04 '23 15:01 potiuk

Migrating from 1.4 to 2.0 could potentially have a bigger effect than migrations of other packages. Some of the providers' dependencies also use SQLAlchemy, and we need to wait until they migrate to 2.0.

Packages in the CI image that currently declare SQLAlchemy as an explicit dependency:

root@6e6ad04bdb9b:/opt/airflow# pipdeptree --reverse --packages SQLAlchemy --python `which python`

sqlalchemy==1.4.46
  - alembic==1.9.1 [requires: SQLAlchemy>=1.3.0]
    - apache-airflow==2.6.0.dev0 [requires: alembic>=1.6.3,<2.0]
  - apache-airflow==2.6.0.dev0 [requires: sqlalchemy>=1.4,<2.0]
  - elasticsearch-dbapi==0.2.9 [requires: sqlalchemy]
  - eralchemy2==1.3.6 [requires: SQLAlchemy>=1.3.19]
  - Flask-AppBuilder==4.1.4 [requires: SQLAlchemy<1.5]
    - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - Flask-SQLAlchemy==2.5.1 [requires: SQLAlchemy>=0.8.0]
    - Flask-AppBuilder==4.1.4 [requires: Flask-SQLAlchemy>=2.4,<3]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - marshmallow-sqlalchemy==0.26.1 [requires: SQLAlchemy>=1.2.0]
    - Flask-AppBuilder==4.1.4 [requires: marshmallow-sqlalchemy>=0.22.0,<0.27.0]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - snowflake-sqlalchemy==1.4.4 [requires: sqlalchemy>=1.4.0,<2.0.0]
  - sqlalchemy-bigquery==1.5.0 [requires: sqlalchemy>=1.2.0,<2.0.0dev]
  - sqlalchemy-drill==1.1.2 [requires: sqlalchemy]
  - SQLAlchemy-JSONField==1.0.1.post0 [requires: sqlalchemy]
    - apache-airflow==2.6.0.dev0 [requires: sqlalchemy-jsonfield>=1.0]
  - sqlalchemy-redshift==0.8.12 [requires: SQLAlchemy>=0.9.2,<2.0.0]
  - SQLAlchemy-Utils==0.39.0 [requires: SQLAlchemy>=1.3]
    - Flask-AppBuilder==4.1.4 [requires: sqlalchemy-utils>=0.32.21,<1]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]

This is not a complete list, because some packages may use SQLAlchemy without declaring it as a dependency; for example, pandas uses it in read_sql.
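The listing above can be filtered mechanically for packages whose declared requirement rules out 2.0. The following is a rough sketch: it only understands `<` and `<=` clauses and ignores pre-release suffixes such as `dev`, so a real check should probably use a proper specifier parser. `find_blockers` and `excludes_sqlalchemy_2` are names invented for this example.

```python
import re

# Matches reverse-pipdeptree lines such as
#   "- snowflake-sqlalchemy==1.4.4 [requires: sqlalchemy>=1.4.0,<2.0.0]"
LINE = re.compile(
    r"(?P<package>[\w.\-]+)==\S+ \[requires: (?P<requirement>[\w.\-]+)(?P<spec>[^\]]*)\]"
)
UPPER_BOUND = re.compile(r"^(<=|<)\s*(\d+(?:\.\d+)*)")


def excludes_sqlalchemy_2(spec):
    """Return True if a "<" or "<=" clause in the specifier rules out 2.0.0."""
    for clause in spec.split(","):
        match = UPPER_BOUND.match(clause.strip())
        if not match:
            continue
        operator, version = match.groups()
        parts = tuple(int(part) for part in version.split("."))
        bound = (parts + (0, 0, 0))[:3]  # pad "2.0" to (2, 0, 0)
        if operator == "<" and bound <= (2, 0, 0):
            return True
        if operator == "<=" and bound < (2, 0, 0):
            return True
    return False


def find_blockers(pipdeptree_output):
    """List packages whose SQLAlchemy requirement excludes 2.0, in input order."""
    blockers = []
    for line in pipdeptree_output.splitlines():
        match = LINE.search(line)
        if (
            match
            and match["requirement"].lower() == "sqlalchemy"
            and excludes_sqlalchemy_2(match["spec"])
            and match["package"] not in blockers
        ):
            blockers.append(match["package"])
    return blockers


SAMPLE = """\
sqlalchemy==1.4.46
  - alembic==1.9.1 [requires: SQLAlchemy>=1.3.0]
    - apache-airflow==2.6.0.dev0 [requires: alembic>=1.6.3,<2.0]
  - apache-airflow==2.6.0.dev0 [requires: sqlalchemy>=1.4,<2.0]
  - Flask-AppBuilder==4.1.4 [requires: SQLAlchemy<1.5]
  - snowflake-sqlalchemy==1.4.4 [requires: sqlalchemy>=1.4.0,<2.0.0]
  - sqlalchemy-bigquery==1.5.0 [requires: sqlalchemy>=1.2.0,<2.0.0dev]
  - sqlalchemy-drill==1.1.2 [requires: sqlalchemy]
"""

print(find_blockers(SAMPLE))
```

Packages without an upper bound (alembic, sqlalchemy-drill in the sample) are not reported, but that only means they do not forbid 2.0 on paper, not that they actually work with it.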

Taragolis avatar Jan 04 '23 15:01 Taragolis

I think we should support both 1.4 and 2.0 for at least one minor release, preferably much longer. Dependencies don't tend to catch up very fast for this kind of migration, and an Airflow installation generally has a lot of those.

uranusjr avatar Jan 04 '23 15:01 uranusjr

Agree.

potiuk avatar Jan 17 '23 09:01 potiuk

I think we should support both 1.4 and 2.0 for at least one minor release, preferably much longer. Dependencies don't tend to catch up very fast for this kind of migration, and an Airflow installation generally has a lot of those.

Fully agreed, we need to support both

kaxil avatar Jan 28 '23 19:01 kaxil

What is the expected version that will bring SQLAlchemy 2.0 support? I am making a service that will use SQLAlchemy 2.0 APIs. I will also integrate Airflow with that service. Based on the ETA, I can decide whether to use SQLAlchemy 2.0 or go ahead with SQLAlchemy 1.4 and not wait for the supported version of Airflow.

infohash avatar Mar 06 '23 14:03 infohash

It can be the next version if you help out on it 🙂

uranusjr avatar Mar 07 '23 10:03 uranusjr

I want to contribute here

auvipy avatar Mar 11 '23 08:03 auvipy

SQLA 1.4 & 2.0 both can be supported at the same time.

auvipy avatar Mar 11 '23 09:03 auvipy

Hi everyone, do we have some kind of board or list of issues that are currently in progress? I see the work is going on, so I would like to help and fix some part that nobody else is working on at the moment. @potiuk

moiseenkov avatar Jul 25 '23 10:07 moiseenkov

@kaxil @phanikumv @uranusjr @hussein-awala Maybe you could help with coordination here? Do you have any plan for the process here? Thanks!

VladaZakharova avatar Jul 26 '23 08:07 VladaZakharova

Hi @VladaZakharova - we are currently suspending this effort till 2.7 is released. We have already completed the changes, but will keep the PR in draft mode until then. The other pending change is to fix the tests so that they use the sqlalchemy 2.0 style (but again they need to wait till 2.7 is released)

cc @jedcunningham

phanikumv avatar Jul 26 '23 10:07 phanikumv

Hi @VladaZakharova - we are currently suspending this effort till 2.7 is released. We have already completed the changes, but will keep the PR in draft mode until then. The other pending change is to fix the tests so that they use the sqlalchemy 2.0 style (but again they need to wait till 2.7 is released)

cc @jedcunningham

@phanikumv , Thank you for replying. Could you please describe the reason for this suspension? Are there any blockers that will be resolved in 2.7?

moiseenkov avatar Jul 26 '23 10:07 moiseenkov

Hey @moiseenkov, since we are prepping for the 2.7 release, we want to ensure that the main branch doesn't have any more merges which might impact core Airflow.

phanikumv avatar Jul 26 '23 15:07 phanikumv

@phanikumv thank you for the update :) Do you know the approximate day when the new version will be released so we can continue work on this issue?

VladaZakharova avatar Jul 26 '23 15:07 VladaZakharova

Not sure of exact date - suggest to follow this thread

phanikumv avatar Jul 26 '23 15:07 phanikumv

Got it! Let's then keep in touch about the updates here to continue work on this issue :)

VladaZakharova avatar Jul 26 '23 15:07 VladaZakharova

It would be better to show a warning on https://airflow.apache.org/docs/apache-airflow/stable/security/flower.html#flower so that people are aware of the issue when trying to add the Flower component.

/home/airflow/.local/lib/python3.7/site-packages/airflow/models/base.py:49 MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
ERROR: You need to upgrade the database. Please run `airflow db upgrade`. Make sure the command is run using Airflow version 2.5.3.

vumdao avatar Jul 29 '23 18:07 vumdao

Hi everyone, what's the current state of this and is it known when we can expect SA2.0 support?

EDIT: A second look tells me flask-appbuilder, which is tightly coupled with airflow, depends on SA<1.5.0. So basically there's no way to install a package that depends on SA>=2.0 in something like a celery worker?

avramdj avatar Sep 21 '23 18:09 avramdj

Hi everyone, it seems that all deprecation warnings are suppressed already. Am I right? If so, are we ready for the upgrade? @Taragolis , @potiuk , @phanikumv

moiseenkov avatar Oct 18 '23 09:10 moiseenkov

Not yet @moiseenkov, we are close but still have a couple of files which we are working on. I will keep you posted when the whole codebase has been refactored!

phanikumv avatar Oct 18 '23 09:10 phanikumv

Not yet @moiseenkov, we are close but still have a couple of files which we are working on. I will keep you posted when the whole codebase has been refactored!

Thanks!

moiseenkov avatar Oct 18 '23 10:10 moiseenkov

Hi Team! Is there any progress on this one? Are we actually close to closing it? :)

VladaZakharova avatar Dec 14 '23 10:12 VladaZakharova

I don't think anyone is tracking progress on this one, due to the huge scope of the task/issue. However, as far as I can see, a lot of code has been migrated from the Query API to the statement-based builder (or whatever it is correctly called).

There are still remaining parts in (this might not be a complete list):

  • In FAB provider: airflow/providers/fab/
  • Migration scripts: airflow/migrations/versions/
  • In selected modules:
    • airflow/dag_processing/processor.py
    • airflow/models/taskinstance.py
    • airflow/sensors/external_task.py
    • airflow/triggers/external_task.py
    • airflow/utils/db_cleanup.py
    • airflow/utils/log/file_task_handler.py

Taragolis avatar Dec 14 '23 10:12 Taragolis

Just for the record, fixing everything will not automatically grant the ability to run Airflow with SQLAlchemy 2.0, due to the upper-bound limitations in the dependencies of Airflow and providers:

root@4c41f46e319c:/opt/airflow# pipdeptree --packages sqlalchemy -r
SQLAlchemy==1.4.50
├── alembic==1.13.0 [requires: SQLAlchemy>=1.3.0]
│   ├── apache-airflow==2.9.0.dev0 [requires: alembic>=1.6.3,<2.0]
│   ├── databricks-sql-connector==2.9.3 [requires: alembic>=1.0.11,<2.0.0]
│   └── sqlalchemy-spanner==1.6.2 [requires: alembic]
├── apache-airflow==2.9.0.dev0 [requires: SQLAlchemy>=1.4.28,<2.0]
├── databricks-sql-connector==2.9.3 [requires: SQLAlchemy>=1.3.24,<2.0.0]
├── eralchemy2==1.3.8 [requires: SQLAlchemy>=1.4]
├── Flask-AppBuilder==4.3.10 [requires: SQLAlchemy<1.5]
│   └── apache-airflow==2.9.0.dev0 [requires: Flask-AppBuilder==4.3.10]
├── Flask-SQLAlchemy==2.5.1 [requires: SQLAlchemy>=0.8.0]
│   └── Flask-AppBuilder==4.3.10 [requires: Flask-SQLAlchemy>=2.4,<3]
│       └── apache-airflow==2.9.0.dev0 [requires: Flask-AppBuilder==4.3.10]
├── marshmallow-sqlalchemy==0.26.1 [requires: SQLAlchemy>=1.2.0]
│   └── Flask-AppBuilder==4.3.10 [requires: marshmallow-sqlalchemy>=0.22.0,<0.27.0]
│       └── apache-airflow==2.9.0.dev0 [requires: Flask-AppBuilder==4.3.10]
├── snowflake-sqlalchemy==1.5.1 [requires: SQLAlchemy>=1.4.0,<2.0.0]
├── sqlalchemy-bigquery==1.9.0 [requires: SQLAlchemy>=1.2.0,<2.0.0dev]
├── sqlalchemy-drill==1.1.4 [requires: SQLAlchemy]
├── SQLAlchemy-JSONField==1.0.2 [requires: SQLAlchemy]
│   └── apache-airflow==2.9.0.dev0 [requires: SQLAlchemy-JSONField>=1.0]
├── sqlalchemy-redshift==0.8.14 [requires: SQLAlchemy>=0.9.2,<2.0.0]
├── sqlalchemy-spanner==1.6.2 [requires: SQLAlchemy>=1.1.13]
└── SQLAlchemy-Utils==0.41.1 [requires: SQLAlchemy>=1.3]
    └── Flask-AppBuilder==4.3.10 [requires: SQLAlchemy-Utils>=0.32.21,<1]
        └── apache-airflow==2.9.0.dev0 [requires: Flask-AppBuilder==4.3.10]

It is also required that these packages support SA 2.0:

Taragolis avatar Dec 14 '23 11:12 Taragolis

This is why I forked SQLAlchemy 2.0 and put it in my project's source directory instead of installing it as a project dependency. Once airflow ecosystem completely moves to SQLA 2.0, I will just stop using my own fork. All I had to do was change the package name of SQLA 2.0. You can also fork it and rebuild SQLA 2.0 wheel with a different package name.

infohash avatar Dec 15 '23 14:12 infohash

Not a deprecation but probably related to this effort: DagBag cannot be imported with SQLA 2 (sqlite connection):

>>> from airflow.models import DagBag
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/.../.venv/lib64/python3.9/site-packages/airflow/__init__.py", line 68, in <module>
    settings.initialize()
  File "/home/.../.venv/lib64/python3.9/site-packages/airflow/settings.py", line 544, in initialize
    configure_orm()
  File "/home/.../.venv/lib64/python3.9/site-packages/airflow/settings.py", line 242, in configure_orm
    engine = create_engine(SQL_ALCHEMY_CONN, connect_args=connect_args, **engine_args, future=True)
  File "<string>", line 2, in create_engine
  File "/home/.../.venv/lib64/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
    return fn(*args, **kwargs)  # type: ignore[no-any-return]
  File "/home/.../.venv/lib64/python3.9/site-packages/sqlalchemy/engine/create.py", line 686, in create_engine
    raise TypeError(
TypeError: Invalid argument(s) 'encoding' sent to create_engine(), using configuration SQLiteDialect_pysqlite/QueuePool/Engine.  Please check that the keyword arguments are appropriate for this combination of components.

This is due to the removal of the deprecated encoding parameter for SQLA 2.0.

I suppose one way to solve this while maintaining backward compatibility is by wrapping the following in a version check:

    # Allow the user to specify an encoding for their DB otherwise default
    # to utf-8 so jobs & users with non-latin1 characters can still use us.
    engine_args["encoding"] = conf.get("database", "SQL_ENGINE_ENCODING", fallback="utf-8")
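A minimal sketch of such a version check, kept free of Airflow imports so it can be tried in isolation. `build_engine_args` is a hypothetical helper, not existing Airflow code; in settings.py the version would presumably come from `sqlalchemy.__version__` and the encoding from the `conf.get(...)` call shown above.

```python
def build_engine_args(sqlalchemy_version: str, encoding: str = "utf-8") -> dict:
    """Return create_engine() keyword arguments suited to the given version.

    The ``encoding`` argument is only passed for SQLAlchemy 1.x, because
    2.0 rejects it with the TypeError shown in the traceback.
    """
    engine_args = {}
    major = int(sqlalchemy_version.split(".", 1)[0])
    if major < 2:
        engine_args["encoding"] = encoding
    return engine_args


print(build_engine_args("1.4.50"))
print(build_engine_args("2.0.25"))
```

Comparing only the major version keeps the check simple; it assumes version strings always start with a numeric major component.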

BTW, resolving the above results in the following deprecation warning:

/home/.../.venv/lib64/python3.9/site-packages/airflow/utils/orm_event_handlers.py:37 SADeprecationWarning: The `sqlalchemy.orm.mapper()` symbol is deprecated and will be removed in a future release. For the mapper-wide event target, use the 'sqlalchemy.orm.Mapper' class.

Changing the offending line as below, while keeping the import the same, seems to resolve the warning:

-event.listen(sqlalchemy.orm.mapper, "before_configured", import_all_models, once=True)
+event.listen(sqlalchemy.orm.Mapper, "before_configured", import_all_models, once=True)

Dev-iL avatar Feb 18 '24 10:02 Dev-iL

Why don't you make a PR for that @Dev-iL ?

potiuk avatar Feb 18 '24 21:02 potiuk

@potiuk Sure thing, will ping when ready for review.

Dev-iL avatar Feb 18 '24 22:02 Dev-iL