
(pre-)Stop/Kill action/command

Open sofixa opened this issue 5 years ago • 10 comments

Currently Nomad supports defining a kill signal, and it would be pretty useful to also be able to define pre-stop/kill actions or commands (post-stop is already possible via tasks with lifecycle > hook > poststop).

The main use case I see for this is shutting down complex software that needs actions performed on it for a graceful shutdown. For example, ScyllaDB recommends running a command (nodetool drain) and then shutting down gracefully before killing the Docker container.

It could also be useful for more graceful drains, for example during rolling upgrades (e.g. failing the healthcheck to make the instance inaccessible via Consul/LB before actually shutting it down).

In theory this could be achieved with an additional hook (prestop), but that might cause some issues. In ScyllaDB's case, for example, the prestop task would need to contain all the tools and configuration required to run commands against the ScyllaDB instance in the main task. It also wouldn't work for this specific case anyway, since they recommend shutting down gracefully via supervisord after draining, and I don't think supervisorctl can be called remotely.
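For illustration, the ScyllaDB case might look something like the sketch below. Note this is purely hypothetical: "prestop" is not a supported hook value in Nomad today, and the image, command, and connectivity to the main task are illustrative assumptions.

```hcl
  # Hypothetical sketch: "prestop" is not a valid lifecycle hook today.
  task "drain-scylla" {
    lifecycle {
      hook = "prestop"
    }

    driver = "docker"
    config {
      # Illustrative image/command; assumes nodetool can reach the
      # ScyllaDB instance running in the main task.
      image   = "scylladb/scylla:latest"
      command = "nodetool"
      args    = ["drain"]
    }
  }
```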

Adrian

sofixa avatar Jan 22 '21 11:01 sofixa

Hi!

We would love to have the ability to configure a pre-stop script. It would help us implement smoother upgrades that require load-balancer reconfiguration. Right now our load balancer, coupled with Consul services and consul-template, detects that a service is no longer running on a node and forwards traffic to a different node, but this happens after a small delay and only after the job has been terminated.

With a pre-stop script we could switch the traffic away on the load balancer before the job starts to shut down, so we wouldn't have to wait for Consul to detect and propagate the change. Once the job is back online, we would use the poststart hook to reconfigure the load balancer to use the local service again.

Also, the prestop hook is the only one missing from the family: prestart, poststart, and poststop all exist. Personally, I would add it for the sake of symmetry.
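A hypothetical pre-stop task for the load-balancer case could use Consul's service maintenance endpoint to drain traffic before shutdown. Again, "prestop" is not a valid hook value today, and the service ID and agent address below are illustrative assumptions:

```hcl
# Hypothetical sketch: "prestop" does not exist in Nomad today.
task "drain-lb" {
  lifecycle {
    hook = "prestop"
  }

  driver = "exec"
  config {
    command = "sh"
    # Put the service into Consul maintenance mode so consul-template
    # reconfigures the load balancer away from this node before shutdown.
    # "my-service" and the local agent address are placeholder values.
    args = ["-c", "curl -s -X PUT 'http://localhost:8500/v1/agent/service/maintenance/my-service?enable=true'"]
  }
}
```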

  • Alex

shcherbachev avatar Sep 21 '21 16:09 shcherbachev

This is impacting our ability to kick off connection draining for our HAProxy containers running in Nomad - similar to @shcherbachev's use-case. I'll take a look at the code for post-stop and pre-start to get an idea of how pre-stop might work. Stay tuned!

mikeblum avatar Dec 08 '21 01:12 mikeblum

Hi @tgross

Forked and set up a Nomad dev environment (very smooth on-boarding; the contrib guide was excellent). After reviewing how pre-start and the other lifecycle hooks are implemented, I have a few questions on the scope of pre-stop:

For reference here are the docs for lifecycle hooks: https://www.nomadproject.io/docs/job-specification/lifecycle#lifecycle-stanza

blog: https://www.hashicorp.com/blog/hashicorp-nomad-task-dependencies

1. Should we support pre-stop for sidecar tasks?

This section of the structs code points to sidecar support for pre-start. If we implemented pre-stop for a sidecar, would we expect it to block stopping the parent task? Or would it be considered a non-blocking, optional failure, such that a pre-stop task with sidecar enabled:

(based on https://www.nomadproject.io/docs/job-specification/lifecycle#init-task-pattern)

  task "halt-telemetry" {
    lifecycle {
      hook = "prestop"
      sidecar = true
    }

    driver = "exec"
    config {
      command = "sh"
      args = ["-c", "while nc -z telemetry.service.local.consul 8080; do sleep 1; done"]
    }
  }

  task "main-app" {
    ...
  }


A use case I could think of would be making sure any buffered logs or other crucial data has been shipped off-box to the telemetry service of choice.

2. Are there any UI components we need to update?

Pre-start / Post-stop task hooks have this UI treatment, which is quite nice when there are several lifecycle tasks:

[screenshot: lifecycle tasks grouped in the allocation UI]

Could this PR just encompass the Go-side changes?

3. How should task kill timeouts be handled?

Example from nomad job init:

# Controls the timeout between signalling a task it will be killed
# and killing the task. If not set a default is used.
kill_timeout = "20s"

In example.nomad the kill_timeout applies to the main task. I imagine we'll want to support this for pre-stop just like it works for post-stop today, but I'm wondering if there are weird implications to having a kill_timeout on both the main and pre-stop tasks: who wins?
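To make the question concrete, here is a sketch of the potentially ambiguous configuration. The "prestop" hook is hypothetical, and the task names and timeouts are illustrative:

```hcl
task "main-app" {
  kill_timeout = "20s"   # grace period for the main task
  # ...
}

# Hypothetical prestop task with its own kill_timeout.
task "drain" {
  lifecycle {
    hook = "prestop"     # not a valid value today
  }
  kill_timeout = "60s"   # does this extend, or race, the main task's timeout?
  # ...
}
```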

Related issues:

Task Lifecycle PostStart Hook: https://github.com/hashicorp/nomad/issues/8366

I'll keep digging into the code but figured I'd pose these higher level Qs to get the :thinking: going.

mikeblum avatar Dec 12 '21 18:12 mikeblum

Our use case is exactly the same as @shcherbachev's. Is there any progress on this?

liemlhdbeatvn avatar Aug 14 '22 15:08 liemlhdbeatvn

Hi @liemlhdbeatvn and others on this issue; this is unfortunately not currently on our near-term roadmap. The team will provide updates as soon as there are any.

jrasell avatar Aug 15 '22 09:08 jrasell

I just wanted to drop in and say a prestop feature would be very useful for my use case as well.

Due to the architecture of the system I'm working on, it takes about 10 minutes for traffic to stop flowing to a task once it's removed from our load balancer. It would be great if I could have a prestop job that removes it from the load balancer, then sleeps for 10 minutes before allowing the main task to be stopped.

ljb2of3 avatar Aug 24 '22 14:08 ljb2of3

Of course, as I continue reading the docs... it appears that shutdown_delay will actually meet my needs. @shcherbachev and @liemlhdbeatvn would this work for you as well?

https://www.nomadproject.io/docs/job-specification/group#shutdown_delay
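For reference, a minimal sketch of the group-level shutdown_delay (group and task names are illustrative):

```hcl
group "app" {
  # Wait after deregistering from Consul before sending the kill signal,
  # giving the load balancer time to stop routing traffic to this node.
  shutdown_delay = "10m"

  task "main-app" {
    # ...
  }
}
```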

With that in mind, I'd still vote that prestop be added for completeness.

ljb2of3 avatar Aug 24 '22 14:08 ljb2of3

Hello! First, thank you for a great product. I am adopting it for a micro-service-based warehouse management system, and I wanted to add another voice for this feature.

My use case: I have stateful server-client interactions (dialogs with hand-held devices) which I organize with sticky sessions. During a rolling upgrade, I want to gracefully transition these stateful sessions from the node that is shutting down to a new one. This involves warning users to quickly finish their tasks, waiting for them to do so (i.e. to reach parts of the code where it is safe, from a business point of view, to kill the session), and then moving the session by asking the client to forget the sticky cookie, etc.

It is a complicated song-and-dance. So far I've run into two problems:

  1. Nomad cancels the Consul registration when the kill signal is dispatched, which is too soon for me.
  2. On Windows/Java I can't catch the Ctrl-Break signal, and Nomad doesn't respect kill_signal on Windows (there is a ticket for that).

So far I am considering all kinds of workarounds for the problems above. Instead, these could be solved cleanly if I could tell my app, through a pre-stop script, that it is time to shut down. It would interact with users, deal with Consul appropriately, etc.

Thank you, Alex

aparfeno avatar Jul 11 '23 10:07 aparfeno

My use case for this is very simple: I would like to issue a few API calls through curl to gracefully shut down Loki.

Reason: I'm using the ephemeral disk to store the index, cache, WAL, etc. on a very fast NVMe drive on the server. Calling these endpoints flushes all the log chunks to S3 storage before the container shuts down, so if there is an error migrating the ephemeral disk to another node I do not lose any logs.

POST /flush
POST /ingester/prepare_shutdown
POST /ingester/shutdown
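With the requested feature, this could be sketched as a hypothetical pre-stop task. The hook value does not exist in Nomad today, and the address assumes Loki's default port 3100 on localhost:

```hcl
# Hypothetical sketch: "prestop" is not a supported hook value today.
task "flush-loki" {
  lifecycle {
    hook = "prestop"
  }

  driver = "exec"
  config {
    command = "sh"
    # Flush chunks to S3 and drain the ingester before the container stops.
    args = ["-c", "curl -s -X POST http://localhost:3100/flush && curl -s -X POST http://localhost:3100/ingester/prepare_shutdown && curl -s -X POST http://localhost:3100/ingester/shutdown"]
  }
}
```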

gjrtimmer avatar Oct 25 '24 17:10 gjrtimmer

@jrasell Any update on this?

gjrtimmer avatar Oct 25 '24 17:10 gjrtimmer

We have the exact same rolling upgrade scenario, where we want to drain connections on HAProxy servers that are about to be shut down.

madsboddum avatar Mar 14 '25 22:03 madsboddum

We have had luck getting a similar outcome with the following workaround.

You basically put in a dummy script that waits forever until it gets a kill signal, immediately followed by another script that performs your failover steps. This pairing runs as your primary task alongside your other task(s).

It eats a small amount of resources the whole time, and you may have to duplicate some variable/template assignments in the additional task, but it does get the job done.

Example

We have 3 Redis jobs to form a cluster. Each job is made up of 3 tasks:

  1. Redis server
  2. Sentinel server
  3. Sentinel server shutter downer (the workaround task) (primary task)

Task 3 has a config that looks like this:

config {
    image = "redis:someversion"

    command = "/bin/bash"
    args = ["/scripts/waitForKill.sh", "/scripts/failoverNode.sh"]
}

where waitForKill.sh looks like:

#!/bin/bash

nextCommand=$1

# Park on an (effectively) infinite sleep in the background.
sleep infinity & PID=$!

handle_shutdown() {
    echo "Caught request to shutdown."
    kill "$PID"
}

# Nomad's kill signal interrupts the wait below.
trap handle_shutdown SIGTERM
trap handle_shutdown SIGINT

echo "Waiting for kill signal..."

wait

# Hand over to the failover script once the signal has been handled.
exec /bin/bash "$nextCommand"

The other two tasks have a shutdown_delay of "5s".

When the allocation is being shut down, the failoverNode.sh script is run. It just checks whether the current allocation is the master; if it is, it asks Sentinel to start a failover. The Sentinel and Redis services then have several seconds to get that worked out before they too are shut down, which is plenty of time.
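The core of this workaround can be reduced to a small shell function, sketched below under assumptions: the function name, timings, and follow-up command are illustrative, and the real scripts above still need the Redis/Sentinel specifics.

```shell
#!/bin/bash
# Sketch of the wait-for-kill pattern: block until SIGTERM/SIGINT arrives,
# then run a follow-up (failover) command. Names here are illustrative.
wait_for_kill() {
    local nextCommand=$1

    # Park on a long sleep; the kill signal interrupts the wait below.
    sleep 300 &
    local pid=$!

    # Double quotes expand $pid now, while it is still in scope.
    trap "echo 'Caught request to shutdown.'; kill $pid 2>/dev/null" TERM INT

    echo "Waiting for kill signal..."
    wait "$pid" || true   # returns (non-zero) once the signal is trapped
    trap - TERM INT       # restore default signal handling

    # Run the failover / cleanup step.
    bash -c "$nextCommand"
}
```

Unlike the waitForKill.sh above, this version returns instead of exec'ing, so it can be composed with other cleanup steps in the same script.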

Having a task that is just waiting around for shutdown isn't ideal though.

sluebbert avatar Mar 18 '25 19:03 sluebbert