airflow
airflow copied to clipboard
Add triggered_by field to DAG Run model to distinguish the source of a trigger
Add the triggered_by
field to the DAG Run model to distinguish the source of a trigger.
This is for the discussion in https://github.com/apache/airflow/pull/37087.
Currently this PR contains only the backend changes. No changes done on the frontend. We can discuss and add some UI elements (different border types, colors, or icons etc.) to have a more distinct look on the UI as well.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst
or {issue_number}.significant.rst
, in newsfragments.
Do we still need run_type
? I think it should be removed in a follow-up PR and this triggered_by is used in places where run_type is used
I don’t disagree with using triggered_by
, but am not sure about removing run_type
due to compatibility issues. And logically the two might not line up exactly…? (In the sense that many runs triggered by the same component may have different types, and vice versa.)
Side note: this PR and the triggered_by
field is a result of an attempt to add a new run_type
in #37087. When we think about the backward compatibility I think it makes sense to keep the run_type
field.
Hi @jscheffl,
Thank you for your message and your point of view. The thing is that at the beginning we were trying to address a very specific use case to distinguish operator-triggered runs and scheduled runs. While trying to find a solution, I implemented #37087 but we had a discussion on that PR and as a conclusion we decided to add the triggered_by
field to distinguish the source for the dag run. It might be handy to know which component triggered the DAG Run.
Since we can trigger a DAG run with different sources, it was making sense to store this information in the table. To be able to distinguish who started the run from the UI, or which operator/task triggered the run in the code a bit more excessive, at least at this point. Maybe adding a record to LOG/audit log table can be useful to know which specific task triggered the run or who started the run from the UI.
Hi @potiuk ! We are struggling here with making CI satisfied 😿 Can you please take a look with us? Maybe you have some ideas? Thank you!
tests/providers/google/cloud/links/test_translate.py::TestTranslationLegacyDatasetLink::test_get_link - TypeError: create_dagrun() got an unexpected keyword argument 'triggered_by'
The "create_dragrun" method in DAG model does not have the triggered_by
argument - and commons.sql and openlineage provider call that method, and this method does not have triggered_by
argument for Airlfow 2.9, 2.8, 2.7. So the tests correctly caught that the change makes tests common.sql and openlineage not compatible with Airflow < 2.10.
You need to implement the change in the way that it will also work when you have airflow 2.7 - 2.9 installed (i.e. dag model has no triggered_by
attribute. This looks like test-only issue - looks like create_dagrun is executed in tests. Simply tests for the provideers should take into account that you might run thet tests and have older versions of Airflow installed.
Various ways how to deal with it and how to run the tests with Airflow 2.7 - 2.9 are described here:
https://github.com/apache/airflow/blob/main/contributing-docs/testing/unit_tests.rst#compatibility-provider-unit-tests-against-older-airflow-releases
(and you can find a number of examples where test compatibility is implemented following those)
Hi @potiuk @Taragolis @uranusjr ! Looks like this PR is finally green :D Can we please merge it?
In the meantime - before @uranusjr responds - this one needs rebasing :)
@potiuk Maybe we can ping someone else regarding the question you asked @uranusjr ? The PR is green and fresh :)
hi @potiuk ! Can you please check these changes again? We have some questions about migration for new Airflow version. Can you please guide us here? Thanks!
I think that one requires another discussion on devlist - I believe currently we do not plan any changes in 2.11 other than making it a "bridge" release for Airflow 3 changes - so that one would be targetted for Airlfow 3, which means that likely many other changes will get implemented in the meantime
Generally speaking - I think currently the idea is that PRs like that should be targetted to Airflow 3 and (if possible and the author would like to do so) - they can be backported, but currently we cannot really back-port them because we have no 2.11 branch that we can back-port it to and this is a new feture, so we cannot really merge them in v2-10-test
.
@potiuk @ephraimbuddy do we need to do a db migration for this PR? In this PR db schema was changed. And previously for the 2.10.x version I have made it. But now for the 3.0.0 version with approach which describe here: https://github.com/apache/airflow/blob/main/contributing-docs/13_metadata_database_updates.rst I can't generate migration file. Is it relevant to generate a migration file for Airflow 3.0.0?
@potiuk @ephraimbuddy do we need to do a db migration for this PR? In this PR db schema was changed. And previously for the 2.10.x version I have made it. But now for the 3.0.0 version with approach which describe here: https://github.com/apache/airflow/blob/main/contributing-docs/13_metadata_database_updates.rst I can't generate migration file. Is it relevant to generate a migration file for Airflow 3.0.0?
Yes, a migration is needed. Else no user could upgrade to 3.0
generate migration file. Is it relevant to generate a migration file for Airflow 3.0.0?
I think the idea is that all 3.0.0 changes will start from "base" 3.0 and when we have migration tool from Airflow 2, the first thing it will do is to migrate from Airlfow 2 to Airlfow 3 and clean migrations table.
But I believe (@ephraimbuddy) when you create Airlfow 3 db from the scratch now and want to add a new migration, it should work similarly as before?
generate migration file. Is it relevant to generate a migration file for Airflow 3.0.0?
I think the idea is that all 3.0.0 changes will start from "base" 3.0 and when we have migration tool from Airflow 2, the first thing it will do is to migrate from Airlfow 2 to Airlfow 3 and clean migrations table.
But I believe (@ephraimbuddy) when you create Airlfow 3 db from the scratch now and want to add a new migration, it should work similarly as before?
Yes. It should work as before. @MaksYermak , temporarily remove the schema changes, then reset the database airflow db reset -y
, next add back those schema changes, and run autogenerate command to generate the migration for you
generate migration file. Is it relevant to generate a migration file for Airflow 3.0.0?
I think the idea is that all 3.0.0 changes will start from "base" 3.0 and when we have migration tool from Airflow 2, the first thing it will do is to migrate from Airlfow 2 to Airlfow 3 and clean migrations table. But I believe (@ephraimbuddy) when you create Airlfow 3 db from the scratch now and want to add a new migration, it should work similarly as before?
Yes. It should work as before. @MaksYermak , temporarily remove the schema changes, then reset the database
airflow db reset -y
, next add back those schema changes, and run autogenerate command to generate the migration for you
@ephraimbuddy thank you for your advice. We successfully made DB migration and now this PR is complete. @uranusjr @potiuk could you please review this PR one more time?
Hi @potiuk ! Can you please check changes again so maybe we can merge it? Everything is finally green 🎉
It looks good but I think it might interact with other Airflow 3 changes especially that @uranusjr already looked it - I think that needs his input.
Okay, then let's wait for @uranusjr approval. Thank you!
Need to resolve conflicts. Otherwise looks good to me.
Hi @uranusjr, @potiuk. It looks like we are green now. Could we merge this PR?
I am good. Waiting for @uranusjr approval and we can merge it
Needs to resolve conflict now - but @uranusjr -> are you ok with that one?
I attempted to solve conflict - but It needs more static check fixes @molcay
Hi @potiuk, Thank you for the attempt. I will have a look ASAP
Transient issue.