
Duplicate entries in API response when TaskInstanceHistory and TaskInstance have same maximum try number

Open tirkarthi opened this issue 1 year ago • 2 comments

Apache Airflow version

main (development)

If "Other Airflow 2 version" selected, which one?

No response

What happened?

While trying out a task with a high number of retries, I noticed that the API sometimes returns duplicate entries for task tries, though the duplication eventually resolves by itself. I traced it to the following query, where the TaskInstanceHistory and TaskInstance entries are combined. When the maximum try_number among the TaskInstanceHistory entries equals the TaskInstance's try_number, the latest try appears twice in the response.

https://github.com/apache/airflow/blob/79db243d03cc4406290597ad400ab0f514975c79/airflow/api_connexion/endpoints/task_instance_endpoint.py#L863-L872
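To illustrate the overlap described above, here is a minimal pure-Python sketch. It is a simplified model and not the actual endpoint code: the `Try` dataclass, the `combine_tries` and `duplicated_try_numbers` helpers, and the sample states are all hypothetical. It only shows that concatenating history rows with the current row yields a repeated try number when the two sources share the same maximum.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Try:
    """Hypothetical stand-in for a row from either table."""
    source: str       # "TIH" for TaskInstanceHistory, "TI" for TaskInstance
    try_number: int
    state: str


def combine_tries(history, current):
    """Concatenate history rows with the current row, with no de-duplication."""
    return sorted([*history, current], key=lambda t: t.try_number)


def duplicated_try_numbers(tries):
    """Return the try numbers that occur more than once."""
    counts = Counter(t.try_number for t in tries)
    return sorted(n for n, c in counts.items() if c > 1)


# Assumed scenario: the history table already holds try 2 while the
# TaskInstance row still reports try_number == 2.
history = [Try("TIH", 1, "failed"), Try("TIH", 2, "failed")]
current = Try("TI", 2, "up_for_retry")

combined = combine_tries(history, current)
print(duplicated_try_numbers(combined))
```

In this model, the duplicate disappears once the current row moves on to try 3, which would match the "eventually resolves by itself" behaviour reported above.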

What you think should happen instead?

No response

How to reproduce

  1. Set up a DAG with a high number of retries.
  2. Notice that API calls occasionally return a duplicate entry for the last try number, as in the scenario below.

image

Operating System

Ubuntu

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

tirkarthi avatar Aug 26 '24 17:08 tirkarthi

cc: @ephraimbuddy @bbovenzi

tirkarthi avatar Aug 26 '24 17:08 tirkarthi

Ahh, this makes sense. I think when try_number is the same, we should only send the TI entry and ignore the TIH entry.
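A rough sketch of that idea in plain Python, assuming each row is a dict with a `try_number` key (the `merge_tries` helper and the `source` field are hypothetical and exist only for illustration; a real fix would more likely filter inside the query itself):

```python
def merge_tries(history_rows, ti_row):
    """Merge rows by try_number, letting the TI row win on a collision."""
    by_try = {row["try_number"]: row for row in history_rows}
    # The TI entry overwrites any TIH entry with the same try_number.
    by_try[ti_row["try_number"]] = ti_row
    return [by_try[n] for n in sorted(by_try)]


history_rows = [
    {"source": "TIH", "try_number": 1, "state": "failed"},
    {"source": "TIH", "try_number": 2, "state": "failed"},
]
ti_row = {"source": "TI", "try_number": 2, "state": "up_for_retry"}

merged = merge_tries(history_rows, ti_row)
for row in merged:
    print(row["try_number"], row["source"], row["state"])
```

Keying on `try_number` keeps exactly one entry per try, so the response length matches the number of distinct tries regardless of which table is ahead.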

bbovenzi avatar Sep 04 '24 15:09 bbovenzi

Both 2.10.2 and 2.10.3 have this issue, and you don't need a high number of retries. As long as you have retries != 0, you'll see duplicated entries.

2024-11-08_14-29_2

2.10.2: https://github.com/apache/airflow/blob/35087d7d10714130cc3e9e9730e34b07fc56938d/airflow/api_connexion/endpoints/task_instance_endpoint.py#L833-L842

2.10.3: https://github.com/apache/airflow/blob/c99887ec11ce3e1a43f2794fcf36d27555140f00/airflow/api_connexion/endpoints/task_instance_endpoint.py#L834-L843

zachliu avatar Nov 08 '24 19:11 zachliu

Yeah I saw that one too during the Man's Hackathon.

potiuk avatar Nov 11 '24 14:11 potiuk