airflow
airflow copied to clipboard
fix: sqa deprecations for airflow tests
related: #28723
fix deprecations for SQLAlchemy 2.0 for Airflow tests.
@Taragolis I went through all test warnings. IMO for the majority of SQA 2.0 warnings there isn't something to do. That is often the case with the warnngs 'object is being merged into a Session along the backref cascade... ' SQA is warning the a object is pulled into the session with a backref cascade and not explicit. But often we compensate for that by adding the object a few lines later in the session which should be fine. In such cases we still get a warning but there won't be an issue later. Here it is also explained.
All tests that raise warnings but should be fine: tests/providers/fab/auth_manager/api_endpoints/test_user_schema.py:63 tests/jobs/test_backfill_job.py:1912 tests/models/test_dagrun.py:2090 tests/models/test_taskinstance.py:1487 tests/ti_deps/deps/test_trigger_rule_dep.py:140 tests/api_connexion/endpoints/test_task_instance_endpoint.py:190
Fixed but still raise a warning tests/serialization/test_pydantic_models.py:162 tests/serialization/test_pydantic_models.py:163
Fixed reported in tests
- [x] tests/utils/test_log_handlers.py:475
- [x] tests/api_experimental/common/test_mark_tasks.py:137
- [x] tests/api_connexion/schemas/test_task_instance_schema.py:69
- [x] tests/api_connexion/schemas/test_task_instance_schema.py:113
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.
Fixed but still raise a warning tests/serialization/test_pydantic_models.py:162 tests/serialization/test_pydantic_models.py:163
Could you provide what kind of error you tried to fix?
Fixed but still raise a warning tests/serialization/test_pydantic_models.py:162 tests/serialization/test_pydantic_models.py:163
Could you provide what kind of error you tried to fix?
Both raise the following warnings
{"category": "sqlalchemy.exc.RemovedIn20Warning", "message": "\"DagScheduleDatasetReference\" object is being merged into a Session along the backref cascade path for relationship \"DatasetModel.consuming_dags\"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)", "filename": "tests/serialization/test_pydantic_models.py", "lineno": 162, "when": "runtest", "node_id": "tests/serialization/test_pydantic_models.py::test_serializing_pydantic_dataset_event", "group": "tests", "count": 1}
{"category": "sqlalchemy.exc.RemovedIn20Warning", "message": "\"TaskOutletDatasetReference\" object is being merged into a Session along the backref cascade path for relationship \"DatasetModel.producing_tasks
We can fix it by first intializing the object, then adding it into the Session and then setting the dataset which is where the backref is happening.
@pytest.mark.skipif(not _ENABLE_AIP_44, reason="AIP-44 is disabled")
def test_serializing_pydantic_dataset_event(session, create_task_instance, create_dummy_dag):
...
dag_ds_ref = DagScheduleDatasetReference(dag_id=dag.dag_id)
session.add(dag_ds_ref)
dag_ds_ref.dataset = ds1
task_ds_ref = TaskOutletDatasetReference(task_id=task1.task_id, dag_id=dag.dag_id)
session.add(task_ds_ref)
task_ds_ref.dataset = ds1
Yet IMO the code would looks less readable than intializing the whole object first and add it into the Session.
@pytest.mark.skipif(not _ENABLE_AIP_44, reason="AIP-44 is disabled")
def test_serializing_pydantic_dataset_event(session, create_task_instance, create_dummy_dag):
...
dag_ds_ref = DagScheduleDatasetReference(dag_id=dag.dag_id, dataset=ds1)
task_ds_ref = TaskOutletDatasetReference(task_id=task1.task_id, dag_id=dag.dag_id, dataset=ds1)
session.add_all([dag_ds_ref, task_ds_ref])
Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior
I guess it points to change in models relationships, maybe it is simmilar: https://github.com/sqlalchemy/sqlalchemy/issues/6148
Oh interesting. Due to your comment I looked again in the SQL Alchemy docs
I thought we have enabled already SQA 2.0 style by setting future=True in the engine here but it seems that this enables 2.0 style of engines and connections only. Yet for the ORM, to anble future behavior in the ORM Sessions we also need to pass future=True in the sessionmaker object.
Session = scoped_session(
sessionmaker(
autocommit=False,
autoflush=False,
bind=engine,
expire_on_commit=False,
future=True
)
)
If I do the warnings disapear. If I remove my fixes the tests fail. So it seems that is pretty usefull to test if the changes / fixes work and it might makes in general sense to set future=True also for the ORM session ?
There is one small things, according to the docs it is recommended switch to Future=True after we resolve all RemovedIn20 warnings. https://docs.sqlalchemy.org/en/20/changelog/migration_20.html#migration-to-2-0-step-five-use-the-future-flag-on-session
So maybe we need to start replace legacy backrefs by back_populates: https://docs.sqlalchemy.org/en/14/orm/relationships.html because this warning related only to backref
Yes legacy backrefs should be replaced anyway by back_populates and explicit relationships.
Yet we will still see those warnings as also with back_populates and an explicit relationship SQLAlchlemy 1.4 still uses the old way of merging an object into the Session along the cascade path.
Just to make sure, we can fix all object is being merged into a Session along the backref cascade path with the current setup (NOT setting future=True in the Session object) when we
- create the object first,
- adding it then to the session and
- adding afterwards the relationship values where backref/back_populates is happening.
Similar to here. IMO it will still be less readable and unstandable. It is also just a warning that does not lead to false behavior when handled correctly. And you can confirm it by using future=True in the Session
WDYT? Shall I change it in a way that all warnings are fixed or shall we acknowledge some warnings as False positives ?
I would rather fix one by one rather than apply fix which might hide this
We still have third party components, which strictly depends on SA14 and might not work correctly with new sessions
Particular this one described in https://docs.sqlalchemy.org/en/14/orm/cascades.html#save-update
And there is description about cascade_backrefs which is True by default in 1.4, and False in 2.0
Ok. Changed.
All RemovedIn20Warning should be fixed.