Summary

This PR explicitly closes listening sockets in parent processe early in the shutdown process. That way, no new connections can hit workers after shutdown is initiated by the parent process.

While here, speedup shutdown by splitting process shutdown into terminate and join loop, so all child processes are shutdown in parallel.

Current behaviour

Here is a sample application:

import asyncio

from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {"Hello": "World"}


@app.on_event("shutdown")
async def shutdown_event():
    print("Shutdown event")
    await asyncio.sleep(10)
    print("Shutdown event done")


@app.get("/sleep")
async def sleep():
    """can be used to test graceful shutdown"""
    await asyncio.sleep(60)

It is ran with:

uvicorn app:app --workers 2

In a second window we can run simulated traffic generator, a curtesy of curl:

while :; do curl --connect-timeout 0.1 --max-time 1 localhost:8000; sleep 1; done

In third window we pkill(1) the parent:

pkill -f 'uvicorn app:app'

What is seen in uvicorn logs:

(fast) ➜  app uvicorn app:app --workers 2
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started parent process [87167]
INFO:     Started server process [87194]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [87195]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Normal traffic:

INFO:     127.0.0.1:63935 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63936 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63938 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63939 - "GET / HTTP/1.1" 200 OK

SIGTERM received:

INFO:     Shutting down
INFO:     Waiting for application shutdown.
Shutdown event

At this point only a single worker receives traffic:

INFO:     127.0.0.1:63940 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63942 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63943 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63944 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63945 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63946 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63947 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63948 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63949 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:63950 - "GET / HTTP/1.1" 200 OK
Shutdown event done
INFO:     Application shutdown complete.
INFO:     Finished server process [87194]
INFO:     Shutting down
INFO:     Waiting for application shutdown.

Now we closed the last workers' sockets but parent still has opened socket, so we are dropping requests on the floor:

Shutdown event
Shutdown event done
INFO:     Application shutdown complete.
INFO:     Finished server process [87195]
INFO:     Stopping parent process [87167]

After that point we are throwing connection-refused.

From the client it looks like:

$ while :; do curl --connect-timeout 0.1 --max-time 1 localhost:8000; sleep 1; done
{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}
# At this point we sent SIGTERM.  All the following responses come from a single alive worker:
{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}{"Hello":"World"}
# Here this worker dies and we start dropping requests on the floor (non-retrieable).
curl: (28) Operation timed out after 1001 milliseconds with 0 bytes received
curl: (28) Operation timed out after 1003 milliseconds with 0 bytes received
curl: (28) Operation timed out after 1004 milliseconds with 0 bytes received
curl: (28) Operation timed out after 1002 milliseconds with 0 bytes received
curl: (28) Operation timed out after 1005 milliseconds with 0 bytes received
# Here master quits and we get clean (retriable connection refused)
curl: (7) Failed to connect to localhost port 8000 after 4 ms: Couldn't connect to server
curl: (7) Failed to connect to localhost port 8000 after 4 ms: Couldn't connect to server
curl: (7) Failed to connect to localhost port 8000 after 4 ms: Couldn't connect to server

New behaviour

New version:

Closes listening socket in parent process immediately.
Shuts down all worker processes in parallel, which has two side effects:
- No load imbalance between workers during shutdown.
- Faster shutdown porcess.

PS. The same situation can be reproduced w/o shutdown lifecycle event with --timeout-graceful-shutdown 30 and an endless stream of slow requests

Checklist

[x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
[ ] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
[ ] I've updated the documentation accordingly.

Jun 17 '23 00:06 rbtz-openai

@Kludex A friendly ping on this ^

Jun 29 '23 16:06 rbtz-openai

ref:

https://github.com/encode/uvicorn/pull/2183

Dec 22 '23 21:12 rbtz-openai

Closing this, since we are refactoring the multiprocess module on https://github.com/encode/uvicorn/pull/2183.

Apr 13 '24 12:04 Kludex

Improve shutdown process in Multiprocess mode

Summary

Current behaviour

New behaviour

Checklist