hive
HIVE-28093: Re-execute DAG in case of NoCurrentDAGException
What changes were proposed in this pull request?
Related to TEZ-4543: this change re-executes the DAG when the client receives DAG_FAILED caused by a NoCurrentDAGException in the AM.
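For illustration, the retry decision described above could be sketched roughly as follows. This is not the code from this patch: the class name `LostAmRetrySketch`, the `maxRetries` bound, and the string match on the failure message are all assumptions made for the example; only the message text "No running DAG at present" is taken from the log line quoted in this description.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a "re-execute on lost DAG" decision.
 * It is NOT Hive's ReExecuteLostAMQueryPlugin; names and the retry bound are assumed.
 */
class LostAmRetrySketch {
    // Assumption: the failure is recognised by this text, as seen in the PR's log excerpt.
    static final String NO_DAG_MARKER = "No running DAG at present";

    private final int maxRetries;                               // hypothetical upper bound on re-executions
    private final Set<String> dagsSeen = new LinkedHashSet<>(); // failed DAG ids, in arrival order
    private int retries = 0;

    LostAmRetrySketch(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Records the failed DAG and returns true if the query should be re-executed. */
    boolean shouldReExecute(String dagId, String failureMessage) {
        dagsSeen.add(dagId);
        boolean lostDag = failureMessage != null && failureMessage.contains(NO_DAG_MARKER);
        if (!lostDag || retries >= maxRetries) {
            return false; // unrelated failure, or retry budget exhausted
        }
        retries++;
        return true;
    }

    /** DAG ids recorded so far, mirroring the "dags seen so far" list in the log. */
    Set<String> dagsSeen() {
        return dagsSeen;
    }
}
```

With `new LostAmRetrySketch(1)`, the first failure carrying the marker text returns `true`, and any later failure returns `false` because the assumed budget is used up. Failures with other messages are never retried by this sketch.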
Why are the changes needed?
TEZ-4543 makes the client return quickly when a restarted AM is not running the queried DAG. Hive can then treat that failure as retryable and re-execute the query instead of failing it.
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
No.
How was this patch tested?
Tested on a cluster; unit tests for the AM and Hive have already been added.
The following is logged after dag_1708961199044_0002_1 had already failed and, because I kept injecting an OOM into the AM (making it crash in a k8s environment), dag_1708961199044_0003_1 failed as well:
hiveserver2 <14>1 2024-02-26T16:00:37.730Z hiveserver2-0 hiveserver2 1 dedef3f4-339f-4ba3-a6ae-300751d3561d [mdc@18060 class="reexec.ReExecuteLostAMQueryPlugin" dagId="dag_1708961199044_0003_1" level="INFO" operationLogLevel="EXECUTION" queryId="hive_20240226155836_6b1e9eb9-efd7-42fd-8872-f4189c5dda3a" sessionId="9e4cb344-ad7f-4344-9b24-aedaf0e73bf4" thread="HiveServer2-Background-Pool: Thread-129"] Got exception message: No running DAG at present retryPossible: true, dags seen so far: [dag_1708961199044_0002_1, dag_1708961199044_0003_1]
Minor stuff; otherwise looks good.
thanks a lot, addressed your comments