Nomad server leave automatically but not permanently

Open EtienneBruines opened this issue 3 months ago • 3 comments

Proposal

Have Nomad servers (and potentially clients?) notify the leader whenever they are interrupted or terminated. Something like leave_on_interrupt, but less permanent.

For follower-servers, we have to wait a while for the leader to pick up on the follower being offline.

Currently I'm not sure whether leader-servers already do this, but I'd imagine something like nomad operator step-down (doesn't exist in Nomad, but does in Vault) would be useful to have Nomad do on shutdown.

In cases where we know the server will be offline for a while (e.g. reboot), there's no need to wait for timeouts/heartbeats.

Use-cases

  • Checking server-health more reliably (e.g. when deciding "Can I, a follower node, reboot right now?")

Attempted Solutions

  • Waiting about 5-10 seconds to see Alive change to Left.

EtienneBruines • Nov 25 '25 16:11

Hi @EtienneBruines! I just want to make sure I understand the scenario and goal here. In the leave_on_interrupt case today, it's expected to be permanent because leaving causes a memberlist and Raft reconfiguration. That's costly, so we recommend against it if you know that server is coming back anyway. (Ex. you don't have an immutable infrastructure thing going where the server is entirely replaced in order to change its config.)

In your scenario, you've got a follower that's restarting or rebooting (for upgrades or whatever) and will come back. What does "leaving" mean here if not the same memberlist and Raft reconfiguration? Or in other words, sending a signal to the leader isn't hard, but what is it that you want the leader to do with this information? If you're looking to coordinate between servers to orchestrate configuration changes across the cluster (ex. host reboots for kernel upgrades), you'd probably be better off querying the host directly via the Agent Health endpoints. That way you can detect a crashed host too.
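As a rough illustration of that suggestion, here is a minimal Python sketch of polling one agent before touching the next host. It assumes the agent's HTTP API is reachable at the given address and that `/v1/agent/health?type=server` returns a JSON body with a `server` object carrying an `ok` flag; the function names `server_is_healthy` and `check_agent` are made up for this example, and ACL tokens and TLS are left out.

```python
import json
import urllib.error
import urllib.request


def server_is_healthy(payload: dict) -> bool:
    """Return True only if the payload reports a healthy server agent."""
    # Assumed response shape: {"server": {"ok": true, "message": "ok"}}
    server = payload.get("server") or {}
    return server.get("ok") is True


def check_agent(addr: str, timeout: float = 2.0) -> str:
    """Classify one agent as 'healthy', 'unhealthy', or 'unreachable'."""
    url = f"{addr}/v1/agent/health?type=server"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except urllib.error.HTTPError:
        # The agent answered, but with an error status.
        return "unhealthy"
    except (urllib.error.URLError, OSError, ValueError):
        # No usable answer at all: a crashed or rebooting host looks like this.
        return "unreachable"
    return "healthy" if server_is_healthy(payload) else "unhealthy"
```

Calling something like `check_agent("http://10.0.0.11:4646")` (address is a placeholder) for each peer distinguishes a host that is down from one that is up but unhealthy, which a membership status alone may not show right away.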

tgross • Dec 04 '25 14:12

Thank you for your reply, @tgross!

> If you're looking to coordinate between servers to orchestrate configuration changes across the cluster (ex. host reboots for kernel upgrades), you'd probably be better off querying the host directly via the Agent Health endpoints. That way you can detect a crashed host too.

That is not a bad idea!

> Or in other words, sending a signal to the leader isn't hard, but what is it that you want the leader to do with this information?

If I execute nomad server members (or its -json equivalent), it not only lists the servers but also a status (e.g. Alive or Left). Having it report a status of Offline or Lost (not quite sure what the expected values are) would not be a bad thing. But I don't know whether it just always reports Alive whenever it's in the memberlist?

EtienneBruines • Dec 04 '25 14:12

> But I don't know whether it just always reports Alive whenever it's in the memberlist?

The possible statuses you can see there are:

  • alive for a server that's joined the memberlist and making serf heartbeats
  • leaving for a server that's started to leave but this hasn't been broadcast to everyone yet (this state should be brief)
  • left for a server that's left the cluster
  • failed for a server that's missed heartbeats
  • none for a server that's showing up in the memberlist but hasn't actually joined (I'm honestly not sure without some more research how you'd ever get into this state).
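To tie these statuses back to the "can I reboot right now?" use-case, here is a small Python sketch. It assumes the member list has already been loaded (for example from the -json output) into dicts with `Name` and `Status` keys; those key names, and the helper `can_reboot`, are assumptions made for illustration. It also treats serf membership as a stand-in for the Raft peer set, which is only an approximation.

```python
def quorum(voters: int) -> int:
    """Smallest majority of a Raft cluster with this many voters."""
    return voters // 2 + 1


def can_reboot(members: list, me: str) -> bool:
    """True if the other alive servers still form a majority without `me`."""
    # Servers that have left (or never joined) are not counted as peers.
    peers = [m for m in members if m["Status"].lower() not in ("left", "none")]
    alive_others = sum(
        1 for m in peers
        if m["Name"] != me and m["Status"].lower() == "alive"
    )
    return alive_others >= quorum(len(peers))


members = [
    {"Name": "server-1", "Status": "alive"},
    {"Name": "server-2", "Status": "alive"},
    {"Name": "server-3", "Status": "failed"},
]
# With one of three servers already failed, rebooting another loses quorum.
print(can_reboot(members, "server-1"))  # False
```

Because failed only shows up after heartbeats are missed, a check like this can be stale for a few seconds after a peer goes down.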

The nomad server members command is surfacing the information from serf directly without transforming it, so this feature would effectively be an overlay on that information so that we're changing "alive" to "offline" (or something like that) temporarily for the operator but not doing anything with that information in Nomad itself. Or maybe we don't even change the serf.Member.Status but add tags to the serf.Member data, that way we're using the existing broadcast of memberlist data without messing with the existing API.
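A sketch of what such an overlay could look like on the display side, in Python. The `maintenance` tag is purely hypothetical (nothing like it exists in Nomad today), as are the `Status` and `Tags` key names and the "offline" label; the point is only that the underlying serf status stays untouched and the tag changes what the operator sees.

```python
# Hypothetical tag a server would set on itself before a planned restart.
MAINTENANCE_TAG = "maintenance"


def display_status(member: dict) -> str:
    """Return the status to show the operator for one member."""
    status = member.get("Status", "none")
    tags = member.get("Tags") or {}
    # Only overlay states where the server is still part of the cluster:
    # "alive" right after the tag is set, "failed" once heartbeats stop.
    if status in ("alive", "failed") and tags.get(MAINTENANCE_TAG) == "true":
        return "offline"
    return status


print(display_status({"Status": "alive", "Tags": {"maintenance": "true"}}))  # offline
print(display_status({"Status": "left", "Tags": {"maintenance": "true"}}))   # left
```

Keeping the overlay in the presentation layer means existing consumers of the raw status would keep seeing the same five values.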

This is an interesting idea... maybe there's other metadata we could allow operators to stick on there too. I'll mark this for roadmapping and further thinking.

tgross • Dec 04 '25 15:12