[FEATURE] Failure Recovery - Support session state monitor and session recovery

Open penghuo opened this issue 2 years ago • 0 comments

Requirements

  • SessionStateMonitor runs at a scheduled interval.
  • Hung job detection and recovery.
    • SessionStateMonitor checks lastUpdateTime.
    • If it has not been updated for more than 15 minutes (configurable), call the EMR-S fetchJob API.
      • If the job is running: recover the session (cancel the current EMR-S job and start a new EMR-S job).
      • If the job is cancelled: set StatementState to FAIL.
  • Failed job auto recovery.
    • SessionStateMonitor detects that StatementState is FAIL and the retry count is less than 3.
    • SessionStateMonitor recovers the session.
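
The requirements above can be sketched as a single monitor pass over one session. This is only an illustration in Python, not the plugin's implementation: `Session`, `JobState`, `check_session`, `recover`, and the `fetch_job_state` / `cancel_job` / `start_job` callbacks are hypothetical names standing in for the EMR-S calls. It also assumes that cancelling an already-stopped job is harmless and that only failed-job retries count toward the limit of 3.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Callable


class JobState(Enum):
    RUNNING = "running"
    CANCELLED = "cancelled"


class StatementState(Enum):
    RUNNING = "running"
    FAIL = "fail"


@dataclass
class Session:
    job_id: str
    last_update_time: datetime
    statement_state: StatementState = StatementState.RUNNING
    retry_count: int = 0


STALE_AFTER = timedelta(minutes=15)  # "configurable" in the requirements
MAX_RETRIES = 3


def recover(session: Session, now: datetime,
            cancel_job: Callable[[str], None],
            start_job: Callable[[], str]) -> None:
    """Recover a session: cancel the current job, then start a new one."""
    cancel_job(session.job_id)  # assumption: safe on an already-stopped job
    session.job_id = start_job()
    session.statement_state = StatementState.RUNNING
    session.last_update_time = now


def check_session(session: Session, now: datetime,
                  fetch_job_state: Callable[[str], JobState],
                  cancel_job: Callable[[str], None],
                  start_job: Callable[[], str]) -> str:
    """Run one monitor pass over a session and return the action taken."""
    # Failed job auto recovery: retry while the retry count is below the limit.
    if session.statement_state is StatementState.FAIL:
        if session.retry_count >= MAX_RETRIES:
            return "gave_up"
        session.retry_count += 1
        recover(session, now, cancel_job, start_job)
        return "retried"

    # Hung job detection: only query the job when the session has gone stale.
    if now - session.last_update_time <= STALE_AFTER:
        return "healthy"
    state = fetch_job_state(session.job_id)
    if state is JobState.RUNNING:
        recover(session, now, cancel_job, start_job)
        return "recovered"
    if state is JobState.CANCELLED:
        session.statement_state = StatementState.FAIL
        return "marked_failed"
    return "no_action"  # job states not covered by the requirements
```

A scheduler would call `check_session` for every active session at each interval. A session marked FAIL on one pass is picked up by the failed-job branch on the next pass, so the two requirements chain together without extra state.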

penghuo, Oct 19 '23 00:10