Update `/healthz` endpoint when shutting down

Open danielballan opened this issue 1 year ago • 1 comments

In the shutdown ASGI lifecycle hook...

https://github.com/bluesky/tiled/blob/7f7329de1b4ab39f502075656102585cdcc35f7c/tiled/server/app.py#L668

...

We need the app to respond to SIGTERM by updating some state, maybe on app.state, that the /healthz endpoint can reference. This should cause /healthz to return a 500 status code, because that is what k8s (and other standard tools) actually check. The body (JSON...) is for humans.

The app needs to wait long enough for HAProxy (or k8s or whatever) to poll /healthz. I think we'll need this to be configurable somehow, because for dev and small deployments we do not want to wait ~10 seconds for the app to shut down after receiving SIGTERM.
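
A rough sketch of the idea, not Tiled's actual code: a flag that the health check reads, plus a configurable wait after the flag is flipped. `ShutdownState`, `healthz`, `begin_shutdown`, and the environment variable `TILED_SHUTDOWN_GRACE_SECONDS` are all invented for illustration. How this gets wired into the real signal handler or ASGI hook is left open.

```python
import asyncio
import json
import os


class ShutdownState:
    """Hypothetical holder for the flag; the issue suggests app.state instead."""

    def __init__(self, grace_seconds: float = 0.0):
        self.shutting_down = False
        self.grace_seconds = grace_seconds


def grace_from_env(default: float = 0.0) -> float:
    # TILED_SHUTDOWN_GRACE_SECONDS is an invented name, not a real Tiled setting.
    # A default of 0 keeps dev and small deployments from waiting at all.
    return float(os.environ.get("TILED_SHUTDOWN_GRACE_SECONDS", default))


def healthz(state: ShutdownState) -> tuple[int, bytes]:
    """Return (status code, JSON body) for /healthz."""
    if state.shutting_down:
        # Pollers look at the status code; the body is for humans.
        return 500, json.dumps({"status": "shutting down"}).encode()
    return 200, json.dumps({"status": "ready"}).encode()


async def begin_shutdown(state: ShutdownState) -> None:
    """Flip the flag, then wait so a poller can observe the failing check."""
    state.shutting_down = True
    await asyncio.sleep(state.grace_seconds)
```

With `grace_seconds=0` the wait is effectively skipped, which would cover the dev case; a production deployment would set it to at least one poll interval of whatever is checking /healthz.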

danielballan avatar Sep 04 '24 18:09 danielballan

Sequence of events:

  1. Application receives SIGTERM (or SIGINT if it is running in a terminal and stopped with ^C).
  2. Application updates /healthz.
  3. Application waits for HAProxy to poll.
  4. Application stops accepting new connections.
  5. Application waits for requests in progress to finish.
  6. Application terminates.
  7. Process exits.
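
To make the ordering concrete, here is a toy asyncio model of steps 1 through 6. It is only a sketch: `ToyApp`, `submit`, and `poll_wait` are hypothetical names, requests are faked with `asyncio.sleep`, and the signal handler that would call `shutdown()` is omitted.

```python
import asyncio


class ToyApp:
    """Toy stand-in for a server; every name here is hypothetical."""

    def __init__(self, poll_wait: float):
        self.poll_wait = poll_wait      # how long to wait for the poller
        self.healthy = True             # what /healthz would report
        self.accepting = True
        self.in_flight: set = set()
        self.log: list = []

    def submit(self, seconds: float) -> asyncio.Task:
        """Start a fake request; refuse new work once accepting is off."""
        if not self.accepting:
            raise ConnectionRefusedError("no longer accepting connections")
        task = asyncio.create_task(asyncio.sleep(seconds))
        self.in_flight.add(task)
        task.add_done_callback(self.in_flight.discard)
        return task

    async def shutdown(self) -> None:
        self.log.append("signal received")       # step 1
        self.healthy = False                     # step 2
        self.log.append("healthz failing")
        await asyncio.sleep(self.poll_wait)      # step 3
        self.accepting = False                   # step 4
        self.log.append("stopped accepting")
        if self.in_flight:                       # step 5
            await asyncio.gather(*self.in_flight)
        self.log.append("requests drained")
        self.log.append("terminated")            # step 6


async def main() -> ToyApp:
    app = ToyApp(poll_wait=0.01)
    app.submit(0.05)        # a request still running when shutdown starts
    await app.shutdown()
    return app


if __name__ == "__main__":
    print(asyncio.run(main()).log)
```

Note that during step 3 the toy app still accepts requests; only the health check is failing, which is what gives the load balancer time to stop routing traffic before step 4.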

danielballan avatar Sep 04 '24 18:09 danielballan

We concluded that this is technically viable but not the best approach.

danielballan avatar Oct 04 '24 12:10 danielballan