airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Raising Exception if dag is not in active state while triggering dag

Open nj7 opened this issue 1 year ago • 6 comments

While triggering dag, currently inactive dag created by a templating method (either using variable or static file) gets triggered but there subsequent task never gets triggered, hence remains in ever running state.And Since the file exists in Dag bag Dag not found exception is never thrown for such kind of dags.

Though this does not affect anything, but response of Success and generating a new runid is misleading.

Adding a new exception DagNotActive which will help user to understand if the dag is deleted.

nj7 avatar Mar 01 '24 18:03 nj7

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst) Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits. Apache Airflow is a community-driven project and together we are making it better 🚀. In case of doubts contact the developers at: Mailing List: [email protected] Slack: https://s.apache.org/airflow-slack

boring-cyborg[bot] avatar Mar 01 '24 18:03 boring-cyborg[bot]

If we do this, it needs a way to trigger and enable the DAG at the same time for e.g. the REST API. I think there were related discussions in the past about how to treat a paused DAG when it is triggered. Can you try to find some of those issues in the tracker?

uranusjr avatar Mar 05 '24 10:03 uranusjr

If we do this it must be consistent between CLI and API.

I think there were related discussions in the past about how to treat a paused DAG when it is triggered

I think this was about the trigger DAG from the UI (if it should automaticly set the DAG to active in case it's paused) cc @jscheffl I guess you would know?

eladkal avatar Mar 08 '24 17:03 eladkal

If we do this it must be consistent between CLI and API.

I think there were related discussions in the past about how to treat a paused DAG when it is triggered

I think this was about the trigger DAG from the UI (if it should automaticly set the DAG to active in case it's paused) cc @jscheffl I guess you would know?

I don't remember there was an explicit discussion or decision about this. At least in the UI we explicitly handle this. If the DAG is not enabled then the user has the option (on per default) to enable the DAG while triggering. But the user is not forced to do. (+ a bit of beauty, if the DAG is already active then the option is not displayed because not needed)

My personal opinion is that there are valid use cases that you can trigger a DAG even if it is not active. Two that are directly coming into my mind:

  • We use the API as interface to other systems and they can use the API to inject workload. But here might be administrative things going on such that we want to have the options as Ops to temporarily turn off scheduling a DAG to prevent errors or overload. But in such case I'd not like to block usage of trigger or the need to implement an upstream additional queue to buffer calls. Especially I see this as a real cool USP feature of Airflow that the DAG runs can be used as queue.
  • When testing a DAG via UI/interactively you might want to prepare test data/runs by triggering them and add the DAG runs into the queue w/o launching immediately. Once all test data is prepared (e.g. create a could of runs and test for concurrency, in our case prepare DAG runs for a small load test and wait how long processing takes)

There might be more. But I would like to prevent changing this behavior ad-hoc w/o discussion or design decision as it touches and changes the external API and behavior. If it is only about intuitiveness, then I'd refer to documentation. Would be fair to extend the docs on this.

I think it would be totally legit to extend the API to be like the UI to provide an option for the potentially three use cases like:

  1. Call it, create a DAG run and "I don't care what the state is" == like today
  2. Call it, create a DAG run and is disable then turn on DAG scheduling == like possible and default in UI
  3. Call it and raise an exception to caller to signal that no workload is accepted == as proposed in this PR

For backwards compatibility the option 1 would need to be default. And of course besides UI make API + CLI consistent :-D

P.S.: Triggering and receiving a DAG ID is totally legit. Because the run would be in the queue and is registered. To refer to it (and potentially delete if off the queue) the Run ID is needed. Run ID does not mean it is the Unix PID of a running process. Even if the DAG is active many runs might be in the queued state and have a Run ID waiting to start.

P.S. (2): If a DAG is dynamically created, disappearing... then I feel there are other design or structural problems. If DAGs are created so dynamic that you rely on something in a distributed system.... might be you need to consider a different modelling? If fr example a DAG should only be visible at certain times or receive workload only 5x8 Mo-Fr then the active scheduling is not the right way. If you want to reject workload you might need to think about permissions.

P.S. (3): Is a DAG is not active it does not mean DA runs are deleted. DAG runs with a Run ID are only deleted if the DAG is "lost" for a longer time and the DAG information is purged. But this is different than then case you disable scheduling.

jscheffl avatar Mar 09 '24 08:03 jscheffl

hmm, interesting points, totally agree with arguments. Let me work on these changes

nj7 avatar Mar 10 '24 06:03 nj7

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 25 '24 00:04 github-actions[bot]