DAG disappearing from Airflow in case of standalone DAG processor
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.7.3
What happened?
If the DAG processor is running in standalone mode with the --num-runs flag set:
airflow dag-processor --num-runs 100
some DAGs disappear from Airflow until the DAG processor runs again.
What you think should happen instead?
DAGs should never disappear.
How to reproduce
- run Airflow 2.7.3 with a standalone DAG processor (separate from the scheduler) with the --num-runs flag, for example:
airflow dag-processor --num-runs 100
- deploy some DAGs
- after some time, ~20 mins (once the DAG processor has parsed all ~50 DAG files), deploy another DAG (a new DAG file)
- after some more time, observe that DAGs start to disappear
The issue, basically, is that:
- the DAG processor runs until it has parsed each file exactly num_runs times: https://github.com/apache/airflow/blob/2872d370178ffed402ac15d2983dbeda8ed7902b/airflow/dag_processing/manager.py#L1245
- if a new DAG file is added to the DAG bag while the DAG processor is running, this new file will have a different number of parsing runs than the files already there
- at some point the other DAGs (those already in the DAG bag) stop being parsed because they hit the run limit and are excluded from the parsing loop, see the "Last Run" column in the screenshot below: https://github.com/apache/airflow/blob/2872d370178ffed402ac15d2983dbeda8ed7902b/airflow/dag_processing/manager.py#L1164
- the DAG processor keeps running until every file has been parsed exactly num_runs times
- because those DAGs are not parsed for that stretch of time, they are removed from the Airflow DAG bag
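The steps above can be sketched as a small stand-alone simulation (hypothetical file names and loop shape, not Airflow's actual code): the loop terminates only when every known file reaches num_runs, so a file deployed mid-run keeps the loop alive while the older files go unparsed.

```python
# Hypothetical simulation (NOT Airflow's actual code) of the --num-runs
# accounting described above. The manager keeps looping until EVERY known
# file has been parsed num_runs times, and skips files that already hit
# the limit. A file appearing mid-run therefore keeps the loop alive
# while the older files sit unparsed.
NUM_RUNS = 100

run_count = {"dag_a.py": 0, "dag_b.py": 0}   # files present at startup
rounds_without_parse = {f: 0 for f in run_count}

round_no = 0
while any(c < NUM_RUNS for c in run_count.values()):
    round_no += 1
    for f in list(run_count):
        if run_count[f] >= NUM_RUNS:
            rounds_without_parse[f] += 1     # excluded from the parsing loop
        else:
            run_count[f] += 1
            rounds_without_parse[f] = 0
    if round_no == 50:
        # a new DAG file is deployed while the processor is still running
        run_count["dag_new.py"] = 0
        rounds_without_parse["dag_new.py"] = 0

# dag_a.py / dag_b.py finish at round 100, but the loop keeps going until
# dag_new.py reaches 100 at round 150 -- 50 rounds in which the old files
# are never re-parsed, long enough for them to be deactivated as stale.
print(rounds_without_parse)   # {'dag_a.py': 50, 'dag_b.py': 50, 'dag_new.py': 0}
```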
Operating System
Ubuntu
Versions of Apache Airflow Providers
No response
Deployment
Other
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Is it reproduced in latest Airflow (2.9.0 at that moment)?
I do not think there has been a change in this area since. But it looks plausible and likely not very difficult to fix, so I will mark it as good-first-issue to investigate and fix.
This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
I didn't test it in Airflow 2.9, but the implementation of the DAG processor at this place in the code should be the same, so the issue should be there as well.
@kosteev -> maybe a chance to contribute a fix yourself ?
Hey, is it okay if I get assigned to this issue? Thanks!
I found the same issue in 2.9.1 .. had to revert back to the traditional scheduler :(
Hey all! I have been trying to reproduce the error and haven't been successful.
I've run the DagProcessor job standalone and created 40+ test DAG files, some of them having timeout functionality within them.
From my understanding, this is my workflow: I run the command -> I can view all my DAGs on the Airflow UI -> I add a new DAG file -> wait 5 minutes -> refresh my page, and all DAG files exist. If there's a timeout error, there will be an explanation in the terminal and/or on the UI about an invalid DAG file or a runtime error.
Can you please provide more information regarding steps to reproduce the issue? Thank you!
Hi, facing the same issue in both Airflow 2.7.3 and 2.9.1. These logs are from 2.7.3; we are getting this in the dag-processor pod logs:
DAG airflow_dag1 is missing and will be deactivated.
[2024-06-28T09:03:57.806+0000] {manager.py:543} INFO - DAG airflow_dag2 is missing and will be deactivated.
[2024-06-28T09:03:57.809+0000] {manager.py:553} INFO - Deactivated 2 DAGs which are no longer present in file.
[2024-06-28T09:03:57.816+0000] {manager.py:557} INFO - Deleted DAG airflow_dag1 in serialized_dag table
[2024-06-28T09:03:57.823+0000] {manager.py:557} INFO - Deleted DAG airflow_dag2 in serialized_dag
Sometimes we are getting this in the scheduler logs:
[2024-07-02T08:02:45.975+0000] {scheduler_job_runner.py:1786} INFO - Found (3) stales dags not parsed after 2024-07-02 07:52:45.971946+00:00.
The DAGs are disappearing from the Airflow UI and the serialized_dag table, and after some time they appear again.
Could anyone help me with this issue?
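For context, the scheduler-side staleness check behind that "stales dags" log line can be sketched roughly like this (a hypothetical illustration; the 10-minute window is inferred from the two timestamps in the log above, not read from any particular configuration):

```python
# Hypothetical sketch of a stale-DAG check: a DAG whose file has not been
# re-parsed within the threshold is considered stale and deactivated
# (it vanishes from the UI until the next successful parse re-activates it).
# The 10-minute threshold is an assumption inferred from the log timestamps.
from datetime import datetime, timedelta

STALE_THRESHOLD = timedelta(minutes=10)

def find_stale(last_parsed: dict[str, datetime], now: datetime) -> list[str]:
    limit = now - STALE_THRESHOLD
    return [dag for dag, ts in last_parsed.items() if ts < limit]

now = datetime(2024, 7, 2, 8, 2, 45)
last_parsed = {
    "dag_ok": now - timedelta(minutes=1),      # parsed recently -> kept
    "dag_stale": now - timedelta(minutes=25),  # not parsed in time -> deactivated
}
print(find_stale(last_parsed, now))   # ['dag_stale']
```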
As explained in Slack: likely your DAG code is producing different outputs (different DAGs) at different times it is parsed. Look for a bug in your DAG code.
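One hypothetical shape of such a bug (illustrative names, not the reporter's actual code): a DAG file whose output depends on the time it is parsed, so consecutive parses appear to add and delete DAGs.

```python
# Hypothetical anti-pattern: the set of DAGs a file defines depends on
# the parse time, so each parse can yield a *different* DAG and the
# previous one looks deleted (and gets deactivated as "missing").
from datetime import datetime

def dags_defined_at(parse_time: datetime) -> set[str]:
    # BUG: dag_id embeds the hour of the parse
    return {f"hourly_report_{parse_time.hour}"}

morning = dags_defined_at(datetime(2024, 7, 2, 8))
evening = dags_defined_at(datetime(2024, 7, 2, 20))

# DAGs no longer present in the file after the later parse:
deactivated = morning - evening
print(deactivated)   # {'hourly_report_8'}
```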
This issue has been closed because it has not received response from the issue author.
I have the problem of DAGs disappearing from Airflow with a standalone DAG processor, but without configuring a custom --num-runs, on 2.10.1 and with only deterministic DAGs (not using any dynamic config for DAG authoring), and:
airflow_dag_processing_total_parse_time is between 0.1 and 1.2 s
airflow_dag_processing_file_path_queue_size is between 0 and 5
I had zero problems without the DAG processor and with 2 schedulers.