DAG run list: URL in run_id column does not work
Apache Airflow version
2.9.1
If "Other Airflow 2 version" selected, which one?
No response
What happened?
In the DAG run view, the URLs in the run_id column don't work.
For example, I have a DAG that ran multiple times. The link in the dag_id column works as expected:
http://localhost:8080/dags/test-task/graph?execution_date=2024-05-14+11%3A39%3A20.673554%2B00%3A00
(It adds &dag_run_id=run_1 to the URL, but that is indeed correct.)
For the same DAG run the link in the run_id column is this: http://localhost:8080/dags/test-task/graph?dag_run_id=run_1
When clicking this it shows me the latest run of the DAG, and the URL is also changed to http://localhost:8080/dags/test-task/grid?dag_run_id=run_29&tab=graph
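To make the difference between these links easier to see, here is a small sketch that pulls the query parameters out of the three URLs quoted above. It only uses the Python standard library; the helper name `query_params` is made up for illustration and is not part of Airflow.

```python
from urllib.parse import parse_qs, urlsplit

# URLs copied verbatim from the report above.
dag_id_link = (
    "http://localhost:8080/dags/test-task/graph"
    "?execution_date=2024-05-14+11%3A39%3A20.673554%2B00%3A00"
)
run_id_link = "http://localhost:8080/dags/test-task/graph?dag_run_id=run_1"
redirected_link = "http://localhost:8080/dags/test-task/grid?dag_run_id=run_29&tab=graph"


def query_params(url):
    """Return the decoded query parameters of a URL as a flat dict (hypothetical helper)."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


for name, url in [
    ("dag_id column", dag_id_link),
    ("run_id column", run_id_link),
    ("after redirect", redirected_link),
]:
    print(name, urlsplit(url).path, query_params(url))
```

The dag_id link identifies the run by `execution_date`, the run_id link identifies it by `dag_run_id=run_1`, and after the redirect the `dag_run_id` has become `run_29`, which is the mismatch described here.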
What you think should happen instead?
The link in the run_id column should take the user to the correct DAG run, because run IDs are unique identifiers of runs.
I'm not entirely sure what the link in the dag_id column should be. I'm fine with that staying as it is now, although I'm not sure what the value of having the same link twice is.
How to reproduce
- Go to a DAG that ran more than once
- For a DAG run that wasn't the latest for that DAG, click on the link in the run_id column. This will still take you to the latest run.
Previously there have been issues with selecting the correct DAG run if it was not among the last 25 runs. Any fix should be verified against that case as well.
This file creates two DAGs, one of which triggers the other 29 times. (The number of runs should be larger than your default_dag_run_display_number, which is 25 by default.)
from airflow import DAG
from airflow.api.common.trigger_dag import trigger_dag
from airflow.operators.python import PythonOperator


def trigger_dags():
    # Trigger 'test-task' 29 times, with run IDs run_1 .. run_29.
    for i in range(1, 30):
        trigger_dag(
            dag_id='test-task',
            run_id=f'run_{i}',
            conf={'run': i},
            replace_microseconds=False,
        )


with DAG(dag_id="test-trigger", schedule=None) as dag:
    task = PythonOperator(
        task_id='trigger-dags',
        python_callable=trigger_dags,
    )

with DAG(dag_id='test-task', schedule=None) as dag2:
    dummy_task = PythonOperator(
        task_id='dummy-task',
        python_callable=lambda params: print(f'this is run nr {params.get("run")}'),
    )
Operating System
Debian bookworm
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
Using the official Helm chart. I see the problem both on my local k8s cluster (on my Mac) and in our EKS cluster.
Anything else?
I've tested it with two different browsers (Brave and Safari), same results.
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
I wonder if the link in the DAG ID column should also take users to the DAG run associated with that row...
This also appears broken for the Task Instance view.
> I wonder if the link in the DAG ID column should also take users to the DAG run associated with that row...
That would be the current behaviour for the "List DAG Run" view. But for the "List Task Instance" view the Dag Id column just takes you to the main page for that DAG (e.g. http://localhost:8080/dags/test-task/graph). I don't have a strong preference for either of them, but I'm in favour of consistency.
I hadn't taken a look at the Task Instance view. The Run Id column there indeed uses the same link as on the DAG Run page, so it is broken for exactly the same reasons. If the URL needs to change, the change has to be applied to both pages. The link in the Task Id column, however, again suffers from the "within last 25 runs" problem. In my example with 29 runs that link works for runs 5 to 29, but for runs 1 to 4 it isn't able to select the right DAG/Task and shows a broken grid view page. So that needs fixing too.
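To spell out the arithmetic behind "runs 5 to 29 work, runs 1 to 4 don't": below is a rough sketch that assumes the grid view only loads the most recent default_dag_run_display_number runs. This is only a model of the behaviour I observed, not Airflow's actual code, and `split_by_window` is a made-up helper name.

```python
# Values taken from the example in this issue.
DEFAULT_DAG_RUN_DISPLAY_NUMBER = 25
TOTAL_RUNS = 29

# Run IDs in trigger order: run_1 (oldest) .. run_29 (latest).
run_ids = [f"run_{i}" for i in range(1, TOTAL_RUNS + 1)]


def split_by_window(run_ids, display_number):
    """Split runs into those inside and outside the 'latest N runs' window.

    Assumption: only the newest `display_number` runs are loaded by the grid.
    """
    cut = max(len(run_ids) - display_number, 0)
    return run_ids[cut:], run_ids[:cut]


visible, hidden = split_by_window(run_ids, DEFAULT_DAG_RUN_DISPLAY_NUMBER)
print("selectable:", visible[0], "to", visible[-1])
print("not selectable:", hidden)
```

Under that assumption the window covers run_5 up to run_29, and run_1 to run_4 fall outside it, which matches the runs for which the Task Id link shows a broken grid view.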
This sounds like the same issue as https://github.com/apache/airflow/issues/39642.
Closing as a duplicate of #39642