
Support for graceful shutdown

Open mjh-c opened this issue 2 years ago • 2 comments

Does ulfius / microhttpd support a graceful shutdown of a service? I am thinking about the following scenario:

  • A running service receives a TERM signal or similar trigger, e.g. a shutdown API that only privileged clients can call, to terminate
  • The service catches this signal and executes a shutdown procedure with the following steps
    • Make sure that no new requests are accepted - maybe by multiple measures
      • Closing the listening socket to not accept new connections
      • Immediately return HTTP status 503 to any new request (not sure if this is the best code to indicate that no new requests are accepted)
    • Wait until all open requests are completed (including sending back the response to the clients)
    • Shut down the service completely
    • Exit the process
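
A minimal sketch of the trigger step only, assuming a POSIX system. It is not taken from ulfius; the names on_term, install_term_handler, shutdown_was_requested and demo_term are made up for illustration. The handler only records that TERM arrived, and the actual shutdown steps are left to main():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the signal handler, read by main().
 * volatile sig_atomic_t is a type the C standard allows a handler to write. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_term(int signo) {
  (void)signo;
  shutdown_requested = 1;   /* only record the request; do the work in main() */
}

/* Install the handler for SIGTERM. Returns 0 on success, -1 on error. */
int install_term_handler(void) {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_term;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGTERM, &sa, NULL);
}

int shutdown_was_requested(void) {
  return shutdown_requested != 0;
}

/* Demo (hypothetical): send SIGTERM to ourselves and report the flag. */
int demo_term(void) {
  int rc = install_term_handler();
  assert(rc == 0);
  raise(SIGTERM);           /* the handler runs before raise() returns */
  return shutdown_was_requested();
}
```

In a real service, main() would check the flag (or block until it is set) and then run the remaining steps of the list.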

The goal is to be as friendly as possible to clients and proxies:

  • If an API call completes, which could also mean that some data has been committed to the database, the service should also make sure to return a response to the client
  • New requests should be rejected clearly so clients can implement some retry logic and proxies can redelegate to other instances
  • The actual shutdown should happen only when no API call is still running

I haven't found any ulfius examples that cover this. Is this something that could be supported in ulfius or implemented by the application?

mjh-c avatar Oct 11 '23 07:10 mjh-c

Hello @mjh-c ,

I made a little POC to test your question here: https://gist.github.com/babelouest/685e16e26047e9d47c4ec699695817ec

In short, it's possible to make a graceful shutdown, like in the gist, which works, but is not perfect.

Basically, I use a pthread_cond_t to send a signal from the endpoint to the main() function, then when the signal is received, the instance is stopped gracefully.
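
A minimal sketch of that signalling pattern with plain pthreads. This is not the code from the gist; struct shutdown_gate, request_shutdown, wait_for_shutdown and the fake_endpoint thread are hypothetical names standing in for the endpoint callback and main():

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Shared between the endpoint callback and main(); names are illustrative. */
struct shutdown_gate {
  pthread_mutex_t lock;
  pthread_cond_t  cond;
  bool            requested;
};

static struct shutdown_gate gate = {
  PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};

/* Called from the shutdown endpoint. */
void request_shutdown(struct shutdown_gate *g) {
  pthread_mutex_lock(&g->lock);
  g->requested = true;
  pthread_cond_signal(&g->cond);
  pthread_mutex_unlock(&g->lock);
}

/* Called from main(): blocks until request_shutdown() has run. */
void wait_for_shutdown(struct shutdown_gate *g) {
  pthread_mutex_lock(&g->lock);
  while (!g->requested)                 /* loop guards against spurious wakeups */
    pthread_cond_wait(&g->cond, &g->lock);
  pthread_mutex_unlock(&g->lock);
}

/* Stand-in for the endpoint callback running on a server thread. */
static void *fake_endpoint(void *arg) {
  request_shutdown(arg);
  return NULL;
}

/* Demo: returns 1 once the wait in "main" has been released. */
int demo_shutdown_signal(void) {
  pthread_t t;
  int rc = pthread_create(&t, NULL, fake_endpoint, &gate);
  assert(rc == 0);
  wait_for_shutdown(&gate);
  pthread_join(t, NULL);
  return gate.requested ? 1 : 0;
}
```

With ulfius, the point where wait_for_shutdown() returns is where main() would go on to call ulfius_stop_framework() and ulfius_clean_instance().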

The problem with this code is that a callback_hello_world call that is still in progress when the shutdown starts is not completed, and the client receives a curl: (52) Empty reply from server.

I think I've achieved something similar in taliesin to gracefully close all websockets connections and clean the resources, see here for example.

I use a counter protected by a mutex to avoid race conditions: it is incremented when a client connects and decremented when the client disconnects. Each time a client decrements the counter, it also sends another signal. When the main() function is in shutdown mode, it checks whether the client counter is 0; if not, it waits for that second signal and checks the counter again, until it reaches 0.
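
A rough sketch of that counter, again with plain pthreads and not copied from taliesin. The names client_connected, client_disconnected and wait_until_no_clients are assumptions, and the fake_client threads only simulate clients disconnecting:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  changed = PTHREAD_COND_INITIALIZER;
static int clients = 0;                /* protected by lock */

void client_connected(void) {
  pthread_mutex_lock(&lock);
  clients++;
  pthread_mutex_unlock(&lock);
}

void client_disconnected(void) {
  pthread_mutex_lock(&lock);
  clients--;
  pthread_cond_signal(&changed);       /* wake main() so it re-checks the counter */
  pthread_mutex_unlock(&lock);
}

/* Called from main() in shutdown mode; returns the counter seen on exit. */
int wait_until_no_clients(void) {
  pthread_mutex_lock(&lock);
  while (clients > 0)
    pthread_cond_wait(&changed, &lock);
  int remaining = clients;
  pthread_mutex_unlock(&lock);
  return remaining;
}

/* Simulated client that disconnects on its own thread. */
static void *fake_client(void *arg) {
  (void)arg;
  client_disconnected();
  return NULL;
}

/* Demo: four clients connect, then disconnect while "main" waits. */
int demo_drain(void) {
  enum { N = 4 };
  pthread_t t[N];
  for (int i = 0; i < N; i++)
    client_connected();                /* register before any thread starts */
  for (int i = 0; i < N; i++) {
    int rc = pthread_create(&t[i], NULL, fake_client, NULL);
    assert(rc == 0);
  }
  int remaining = wait_until_no_clients();
  for (int i = 0; i < N; i++)
    pthread_join(t[i], NULL);
  return remaining;
}
```

The wait is unbounded here; a real service would probably use pthread_cond_timedwait() so that a stuck client cannot block the shutdown forever.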

To reject new clients during shutdown, you can share another variable between the main() function and the callback: if this variable is 0, the service is in shutdown mode, so the endpoint returns immediately with a status 503, for example.
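
As an illustration, a library-free sketch of that check. handle_request is a stand-in for an endpoint callback and just returns the status code it would send; accepting and enter_shutdown_mode are hypothetical names:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int accepting = 1;   /* 1: normal operation, 0: shutdown mode */

/* Called from main() when the shutdown starts. */
void enter_shutdown_mode(void) {
  pthread_mutex_lock(&state_lock);
  accepting = 0;
  pthread_mutex_unlock(&state_lock);
}

/* Stand-in for an endpoint callback: returns the HTTP status it would send. */
int handle_request(void) {
  pthread_mutex_lock(&state_lock);
  int ok = accepting;
  pthread_mutex_unlock(&state_lock);
  if (!ok)
    return 503;             /* Service Unavailable: reject before doing any work */
  /* ... normal request processing would go here ... */
  return 200;
}

/* Demo: one request before and one after entering shutdown mode. */
int demo_gate(void) {
  int before = handle_request();
  enter_shutdown_mode();
  int after = handle_request();
  assert(before == 200);
  return after;
}
```

In a real ulfius callback the rejection would presumably be something like ulfius_set_string_body_response(response, 503, "...") followed by return U_CALLBACK_COMPLETE. If the client counter is used as well, checking the variable and incrementing the counter under the same lock keeps a request from slipping in between the two.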

Anyway, your request seems possible, although you have to play with a few mutexes and conditions, but I don't see why not.

Hope I helped!

babelouest avatar Oct 15 '23 22:10 babelouest

First of all, thank you for the effort. I also investigated further and tried to solve it by implementing my own MHD_OPTION_NOTIFY_COMPLETED function, which then calls the ulfius mhd_request_completed at the end.

Using this hook I know for sure if an API is really finished, i.e. the response was sent back to the client by the microhttpd layer. So I can wait a certain amount of time for APIs to be completed which is an important part of "graceful" behavior from client perspective.
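
The chaining idea can be sketched without the libraries. This is not the code from my fork: completed_cb is a simplified stand-in for the libmicrohttpd callback type, and struct completion_hook, on_request_completed and fake_inner are hypothetical names. fake_inner plays the role of the ulfius handler that is called at the end:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Simplified stand-in for libmicrohttpd's request-completed callback type;
 * the real one takes a struct MHD_Connection * and an
 * enum MHD_RequestTerminationCode instead of void * and int. */
typedef void (*completed_cb)(void *cls, void *connection, void **con_cls, int toe);

struct completion_hook {
  completed_cb    inner;      /* handler to chain to */
  void           *inner_cls;
  pthread_mutex_t lock;
  int             completed;  /* requests fully finished so far */
};

/* Registered in place of the inner handler; cls points at the hook. */
void on_request_completed(void *cls, void *connection, void **con_cls, int toe) {
  struct completion_hook *h = cls;
  pthread_mutex_lock(&h->lock);
  h->completed++;             /* own bookkeeping first */
  pthread_mutex_unlock(&h->lock);
  if (h->inner != NULL)
    h->inner(h->inner_cls, connection, con_cls, toe);  /* then the original handler */
}

/* Stand-in for the library's own handler: counts how often it is called. */
static void fake_inner(void *cls, void *connection, void **con_cls, int toe) {
  (void)connection; (void)con_cls; (void)toe;
  *(int *)cls += 1;
}

/* Demo: two requests complete; both the hook and the inner handler see them. */
int demo_hook(void) {
  int inner_calls = 0;
  struct completion_hook h;
  h.inner = fake_inner;
  h.inner_cls = &inner_calls;
  h.completed = 0;
  int rc = pthread_mutex_init(&h.lock, NULL);
  assert(rc == 0);
  void *con_cls = NULL;
  on_request_completed(&h, NULL, &con_cls, 0);
  on_request_completed(&h, NULL, &con_cls, 0);
  assert(inner_calls == 2);
  pthread_mutex_destroy(&h.lock);
  return h.completed;
}
```

During shutdown, main() can compare this completed count with the number of requests that were started to decide whether anything is still running.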

One drawback is that I have to call ulfius_start_framework_with_mhd_options which is not really handy. To get the default MHD options and flags from ulfius, I implemented an extra function in my ulfius fork. https://github.com/mjh-c/ulfius/commit/c5ff3f741789577e89e6cbdb825f1c88b095f70e (Just to get the idea. It is against a patched 2.7.10 branch. If there is interest I can prepare a PR against master).

With this extension I don't need to copy the whole ulfius_run_mhd_daemon handling of MHD options and flags in my application code and keep it in sync with upcoming versions.

The handling of immediately sending a 503 when in shutdown phase I implemented already like you suggested. Thanks for confirming.

For the synchronization of the main and MHD threads I use shared volatile variables, which is not elegant. I will look at your pthread_cond_t changes to see whether it makes sense to replace them.
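
One possible replacement for the volatile variables, as a sketch only: C11 atomics from <stdatomic.h>, which is a different technique from the pthread_cond_t approach. volatile alone gives neither atomicity nor ordering between threads in C. The names request_begin, request_end, begin_shutdown and drained are made up for this example:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool shutting_down = false;
static atomic_int  in_flight     = 0;

/* Called at the start of every request.
 * Returns false if the request must be rejected (e.g. with 503). */
bool request_begin(void) {
  atomic_fetch_add(&in_flight, 1);       /* announce the request first ... */
  if (atomic_load(&shutting_down)) {     /* ... then check the flag */
    atomic_fetch_sub(&in_flight, 1);
    return false;
  }
  return true;
}

/* Called when the response has been sent. */
void request_end(void) { atomic_fetch_sub(&in_flight, 1); }

/* Called from main() when the shutdown starts. */
void begin_shutdown(void) { atomic_store(&shutting_down, true); }

/* True when no accepted request is still running. */
bool drained(void) { return atomic_load(&in_flight) == 0; }

/* Demo: one request accepted, shutdown starts, the next one is rejected. */
int demo_atomics(void) {
  bool first = request_begin();    /* accepted: not shutting down yet */
  begin_shutdown();
  bool second = request_begin();   /* rejected: flag is set */
  assert(first && !second);
  bool busy = !drained();          /* the first request is still in flight */
  request_end();
  return (busy && drained()) ? 1 : 0;
}
```

Because the request increments the counter before it reads the flag, and main() sets the flag before it reads the counter, either the request sees the shutdown or main() sees the request. Atomics do not block, though, so main() would still have to poll drained() or combine this with a condition variable.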

mjh-c avatar Oct 16 '23 13:10 mjh-c