
Stop system job after disconnect with server for some time

Open Jamesits opened this issue 3 months ago • 1 comments

Proposal

Implement stop_after_client_disconnect for system jobs.

Use-cases

I use Nomad to start a health check responder agent on a set of Azure VMSS clients, so Azure can detect whether the Nomad client has started and the driver is working. But when the Nomad client later fails to connect to the server, there is no way to kill the health check responder agent automatically, so the auto recovery mechanism cannot kick in.

Attempted Solutions

Check for Nomad client errors in the health check program. However, the client might not expose an HTTP API, which makes this impossible to implement in general.
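For illustration, here is a rough sketch of what that workaround could look like when the agent's HTTP API is reachable. It assumes the API is served on the default local address and that the `/v1/agent/health?type=client` endpoint reflects whether the client still knows any servers; both are assumptions and may not hold in a given setup. `probe_nomad_client`, `DisconnectTracker`, and the grace period are hypothetical names made up for this example, not part of Nomad.

```python
import json
import urllib.request

# Assumption: the local Nomad agent serves its HTTP API on the default address.
NOMAD_HEALTH_URL = "http://127.0.0.1:4646/v1/agent/health?type=client"


def probe_nomad_client(url=NOMAD_HEALTH_URL, timeout=2.0):
    """Return True if the agent reports its client as healthy, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
        return bool((body.get("client") or {}).get("ok", False))
    except (OSError, ValueError, AttributeError):
        # Connection refused, timeout, HTTP error status, or an unexpected
        # response body: treat all of these as "not healthy".
        return False


class DisconnectTracker:
    """Hypothetical helper: tolerate failed probes for a grace period."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.unhealthy_since = None  # time of the first failed probe in a row

    def observe(self, ok, now):
        """Record one probe result; return whether to keep reporting healthy."""
        if ok:
            self.unhealthy_since = None
            return True
        if self.unhealthy_since is None:
            self.unhealthy_since = now
        # Stay healthy until the client has been failing for the full grace period.
        return (now - self.unhealthy_since) < self.grace_seconds
```

The responder would call `tracker.observe(probe_nomad_client(), time.monotonic())` on each Azure health probe and answer unhealthy once it returns `False`. This only works when the HTTP API is enabled and reachable from the task, which is exactly the limitation described above.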

Maybe related: https://github.com/hashicorp/nomad/pull/19886

Jamesits avatar Apr 03 '24 09:04 Jamesits

Hi @Jamesits and thanks for raising this feature request. We think this makes sense and would be a good thing to support. I'll get it added to our backlog.

jrasell avatar Apr 04 '24 12:04 jrasell