
Clearing a running dag run does not reset the queued_at timestamp

Open ferruzzi opened this issue 3 weeks ago • 7 comments

Apache Airflow version

3.1.3

If "Other Airflow 2/3 version" selected, which one?

No response

What happened?

When you clear a dag run, its state changes to QUEUED, which should set the queued_at timestamp to the new re-queue time, but it does not. I submitted https://github.com/apache/airflow/pull/59066, but that only fixed it when the run is in SUCCESS or FAILURE state; if you clear a RUNNING run, it still doesn't work. The discussion in that PR also implies that there might be something deeper going on with SQLAlchemy, which my "fix" may just be papering over instead of actually fixing, and that may be worth a look.

I might get to this one, but I am trying to wrap a few other things up before the holidays, so if someone gets to it before me, that would be great.

What you think should happen instead?

When you clear a dag run, it changes the state to QUEUED which should set the queued_at timestamp to the newly-re-queued time

How to reproduce

I tested this by running a dag like this:

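from airflow.sdk import DAG
from airflow.providers.standard.operators.bash import BashOperator

with DAG(dag_id="really_long_dag"):
    BashOperator(task_id='sleep_task', bash_command='sleep 10000')  # long sleep keeps the run in RUNNING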

Run the dag and check the queued_at time in the database. You can use psql in the Breeze environment or perhaps your IDE has a database monitoring connection; I know PyCharm does.

Clear the run (I used the UI) and the run state will flash through QUEUED and on to RUNNING, but the queued_at timestamp will remain unchanged.

Wait for the dag to finish or force it to a terminal state and clear it again. You'll see it flash through the state change again, but this time the queued_at time in the db will have updated.

Operating System

linux

Versions of Apache Airflow Providers

No response

Deployment

Other

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

ferruzzi avatar Dec 08 '25 21:12 ferruzzi

Can I work on this?

henry3260 avatar Dec 09 '25 02:12 henry3260

I tried debugging this. The docstring for the method has a note on this. The start_date, clear_number and queued_at are updated only for dagruns in a finished state. For a running dagrun, dr.state in State.finished_dr_states is false, so none of them are updated.

https://github.com/apache/airflow/blob/7218cf045812a5a7135d7f119cdde0aed817b132/airflow-core/src/airflow/models/taskinstance.py#L200-L206

https://github.com/apache/airflow/blob/7218cf045812a5a7135d7f119cdde0aed817b132/airflow-core/src/airflow/models/taskinstance.py#L281-L307
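
To make the condition described above concrete, here is a minimal toy model of it. This is not the Airflow source: FakeDagRun, RunState, FINISHED_STATES and clear_run are hypothetical stand-ins, and the exact fields reset in the real code may differ. It only illustrates why a RUNNING run keeps its old queued_at.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class RunState(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


# Hypothetical stand-in for State.finished_dr_states: terminal states only.
FINISHED_STATES = frozenset({RunState.SUCCESS, RunState.FAILED})


@dataclass
class FakeDagRun:
    """Toy dag run with just the fields discussed in this issue."""
    state: RunState
    queued_at: Optional[datetime] = None
    start_date: Optional[datetime] = None
    clear_number: int = 0


def clear_run(dr: FakeDagRun, now: datetime) -> None:
    """Simplified model of the clear path: reset only finished runs."""
    if dr.state in FINISHED_STATES:
        dr.state = RunState.QUEUED
        dr.queued_at = now
        dr.start_date = None  # assumption: the real handling may differ
        dr.clear_number += 1
    # A RUNNING run falls through, so queued_at is left untouched.


t0 = datetime(2025, 12, 8, 21, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 12, 8, 22, 0, tzinfo=timezone.utc)

running = FakeDagRun(state=RunState.RUNNING, queued_at=t0)
clear_run(running, t1)

finished = FakeDagRun(state=RunState.SUCCESS, queued_at=t0)
clear_run(finished, t1)
```

In this model, running.queued_at is still t0 after the clear, while finished.queued_at becomes t1, which matches the behavior reported in the reproduction steps.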

tirkarthi avatar Dec 09 '25 13:12 tirkarthi

Clearing a dagrun in running state doesn't move it to queued for the scheduler to then move to running. The dagrun remains in running state. The commit referenced below made this change in Airflow 2.7.0; before it, clearing always set the dagrun state to queued. Perhaps for the deadlines use case dr.queued_at could be updated during clearing irrespective of the dagrun's current state, but that's still semantically a little different, because the dagrun never actually moves to queued state (updating queued_at) and then on to running.

Ref commit : https://github.com/apache/airflow/commit/070ecbd87c5ac067418b2814f554555da0a4f30c

tirkarthi avatar Dec 09 '25 13:12 tirkarthi

Hm, I wonder if that's still the desired behavior. I can do an easy deadline-specific workaround by adding a check before the if dr.state in State.finished_dr_states: line, but I wonder if it's something we want to change. What are the advantages of this approach over always flagging the run QUEUED? Are we saying that if a run is currently running when a user clears it, it just restarts instead of getting tossed back into the queue? That doesn't seem like the expected behavior, but maybe I'm wrong.

ferruzzi avatar Dec 09 '25 18:12 ferruzzi

I suspect this change was made for performance optimizations or to simplify the state transitions. That said, I agree the current behavior feels unintuitive.

henry3260 avatar Dec 09 '25 18:12 henry3260

Reading through the original issue that spawned that change, the request was that we should not change the start_date on a clear, which I agree with, but I feel like the queued_at time should reset.

edit: confusing phrasing

ferruzzi avatar Dec 09 '25 18:12 ferruzzi

Started an email thread on the dev list to get community thoughts. My proposal is that we should always kick the run back to the queue and update the queued_at regardless of the current state.
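
As a rough illustration of that proposal, the sketch below always re-queues the run and refreshes queued_at, whatever state the run was in. It is a toy model under stated assumptions, not Airflow code: FakeDagRun and clear_run_always_requeue are hypothetical names, and the states are plain strings for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FakeDagRun:
    """Toy dag run; hypothetical, not the Airflow DagRun model."""
    state: str
    queued_at: Optional[datetime] = None


def clear_run_always_requeue(dr: FakeDagRun, now: Optional[datetime] = None) -> FakeDagRun:
    """Proposed behavior: no check on the current state before re-queueing."""
    dr.state = "queued"
    dr.queued_at = now or datetime.now(timezone.utc)
    return dr


old = datetime(2025, 12, 8, 21, 0, tzinfo=timezone.utc)
new = datetime(2025, 12, 10, 20, 0, tzinfo=timezone.utc)

# Every starting state ends up queued with a fresh timestamp.
results = [
    clear_run_always_requeue(FakeDagRun(state=s, queued_at=old), now=new)
    for s in ("running", "success", "failed")
]
```

Whether start_date should also be preserved or reset in this path is a separate question that the sketch deliberately leaves out.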

ferruzzi avatar Dec 10 '25 20:12 ferruzzi