feat: Automate force-termination of hanging sessions

Open rapsealk opened this issue 2 years ago • 3 comments

This PR resolves #603. It automates force-termination of hanging sessions in two steps:

  1. Query kernels
  • whose status is PREPARING or TERMINATING
  • and for which more than a given datetime.timedelta has elapsed since status_history[status]
  2. Force-terminate the matching kernels and clean up.
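
The selection step above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Kernel` dataclass, the `find_hanging_kernels` helper, and the assumption that `status_history` maps a status name directly to a timezone-aware `datetime` are all hypothetical stand-ins for the manager's real models and database query.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional

# Statuses named in the PR description as candidates for force-termination.
HANGING_STATUSES = {"PREPARING", "TERMINATING"}


@dataclass
class Kernel:
    # Hypothetical, simplified stand-in for a kernel row.
    id: str
    status: str
    # Assumed shape: status name -> time the kernel entered that status.
    status_history: dict[str, datetime] = field(default_factory=dict)


def find_hanging_kernels(
    kernels: Iterable[Kernel],
    threshold: timedelta,
    now: Optional[datetime] = None,
) -> list[Kernel]:
    """Return kernels stuck in a hanging status for longer than `threshold`."""
    now = now or datetime.now(timezone.utc)
    hanging = []
    for kernel in kernels:
        if kernel.status not in HANGING_STATUSES:
            continue
        entered_at = kernel.status_history.get(kernel.status)
        if entered_at is None:
            # No timestamp recorded for the current status; skip rather than guess.
            continue
        if now - entered_at > threshold:
            hanging.append(kernel)
    return hanging
```

The kernels returned by such a helper would then be handed to the force-termination and cleanup step. Passing `now` explicitly keeps the check deterministic in tests.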

rapsealk avatar Aug 22 '22 05:08 rapsealk

Deploy Preview for backendai-docs-preview failed.

Latest commit: 4fd5b24f067aec2ca2cf8400ba47be2d0061196e
Latest deploy log: https://app.netlify.com/sites/backendai-docs-preview/deploys/630310aea581160009130006

netlify[bot] avatar Aug 22 '22 05:08 netlify[bot]

@adrysn I also added a config flag, force-terminate-hanging-sessions, to manager.toml. It defaults to true, and a user can set it to false to disable the feature.
https://github.com/lablup/backend.ai/blob/ccea5c63de2a9696fb378b13dda003075a800dfe/configs/manager/halfstack.toml#L16-L24
https://github.com/lablup/backend.ai/blob/ccea5c63de2a9696fb378b13dda003075a800dfe/src/ai/backend/manager/server.py#L688-L689

rapsealk avatar Aug 25 '22 03:08 rapsealk

Maybe we can also add a config option for the sleep interval between checks?
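
One way such an interval could be wired in, as a rough sketch only: the `run_periodically` helper and its `interval`, `enabled`, and `max_runs` parameters are hypothetical names, not options that exist in manager.toml. `enabled` mirrors the idea of the force-terminate-hanging-sessions flag, and `max_runs` exists only so the loop can be stopped in a test.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_periodically(
    task: Callable[[], Awaitable[None]],
    interval: float,
    enabled: bool = True,
    max_runs: Optional[int] = None,
) -> int:
    """Await `task`, then sleep `interval` seconds, repeatedly.

    Returns the number of times `task` ran. With `enabled=False` the
    task never runs. With `max_runs=None` the loop runs until cancelled.
    """
    runs = 0
    if not enabled:
        return runs
    while max_runs is None or runs < max_runs:
        await task()
        runs += 1
        # The configurable sleep interval discussed in the comment above.
        await asyncio.sleep(interval)
    return runs
```

In a long-running service this coroutine would typically be started as a background task and cancelled on shutdown.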

rapsealk avatar Aug 25 '22 03:08 rapsealk

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Mar 26 '23 03:03 CLAassistant

I'd like to add this feature! Please merge the latest main branch.

achimnol avatar Jun 30 '23 08:06 achimnol