Fix uvicorn gunicorn worker class restarts when USR1 is issued
Copied from my original post https://github.com/encode/uvicorn/issues/1559#issuecomment-1179776669
Steps to reproduce
Minimal Fastapi code
import logging
import signal
import time
from logging import handlers
from fastapi import FastAPI
from starlette.responses import PlainTextResponse
logging.basicConfig(level=logging.DEBUG, filename='./log.txt')
logger = logging.getLogger()
log_handler = logging.handlers.WatchedFileHandler('./log.txt')
logger.addHandler(log_handler)
app = FastAPI(title="uvicorn-test")
logger = logging.getLogger(__name__)
@app.on_event("startup")
async def on_startup():
# signal.signal(signal.SIGUSR1, signal.SIG_IGN)
ctime = time.ctime()
logger.debug(f"uvicorn-test start up {ctime}")
@app.on_event("shutdown")
async def on_shutdown():
ctime = time.ctime()
logger.debug(f"uvicorn-test shutdown {ctime}")
@app.get("/health_check", response_class=PlainTextResponse)
def health_check():
ctime = time.ctime()
logger.debug(f"logging test {ctime}")
Run with gunicorn
gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app
Set up logging
while [ 0 ]; do
curl 'http://localhost:8000/health_check'
echo
sleep 1
done
Set up logrotate
/etc/logrotate.d/uvicorn-test
{file_location.txt}
{
daily
dateext
rotate 30
missingok
notifempty
create 0644 user group
compress
delaycompress
postrotate
kill -USR1 $(cat /tmp/log/gunicorn.pid)
endscript
}
Send USR1 signal
kill -USR1 $(cat /tmp/log/gunicorn.pid)
or force logrotate
sudo logrotate --force /etc/logrotate.d/uvicorn-test
We can see the process is terminated:
~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app
[2022-07-11 02:41:14 +0900] [22563] [INFO] Starting gunicorn 20.1.0
[2022-07-11 02:41:14 +0900] [22563] [INFO] Listening at: http://127.0.0.1:8000 (22563)
[2022-07-11 02:41:14 +0900] [22563] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2022-07-11 02:41:14 +0900] [22565] [INFO] Booting worker with pid: 22565
[2022-07-11 02:41:15 +0900] [22565] [INFO] Started server process [22565]
[2022-07-11 02:41:15 +0900] [22565] [INFO] Waiting for application startup.
[2022-07-11 02:41:15 +0900] [22565] [INFO] Application startup complete.
[2022-07-11 02:45:32 +0900] [22563] [INFO] Handling signal: usr1
[2022-07-11 02:45:32 +0900] [22563] [WARNING] Worker with pid 22565 was terminated due to signal 30 <<-
[2022-07-11 02:45:32 +0900] [23506] [INFO] Booting worker with pid: 23506
[2022-07-11 02:45:32 +0900] [23506] [INFO] Started server process [23506]
[2022-07-11 02:45:32 +0900] [23506] [INFO] Waiting for application startup.
[2022-07-11 02:45:32 +0900] [23506] [INFO] Application startup complete.
Proposal: Change uvicorn/worker.py to the following code:
def init_signals(self) -> None:
# Reset signals so Gunicorn doesn't swallow subprocess return codes
# other signals are set up by Server.install_signal_handlers()
# See: https://github.com/encode/uvicorn/issues/894
for s in self.SIGNALS:
signal.signal(s, signal.SIG_DFL)
# Remove me if SIGUSR1 is handled in server.py
signal.signal(signal.SIGUSR1, self.handle_usr1)
# Don't let SIGUSR1 disturb active requests by interrupting system calls
signal.siginterrupt(signal.SIGUSR1, False)
Re-run test
~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app
[2022-07-11 02:46:42 +0900] [24005] [INFO] Starting gunicorn 20.1.0
[2022-07-11 02:46:42 +0900] [24005] [INFO] Listening at: http://127.0.0.1:8000 (24005)
[2022-07-11 02:46:42 +0900] [24005] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2022-07-11 02:46:42 +0900] [24006] [INFO] Booting worker with pid: 24006
[2022-07-11 02:46:42 +0900] [24006] [INFO] Started server process [24006]
[2022-07-11 02:46:42 +0900] [24006] [INFO] Waiting for application startup.
[2022-07-11 02:46:42 +0900] [24006] [INFO] Application startup complete.
[2022-07-11 02:46:46 +0900] [24005] [INFO] Handling signal: usr1
Process is no longer terminated, logrotates fine and log.txt is still appended after the file rotation.

tail -F log.txt
logging test Mon Jul 11 02:46:42 2022
logging test Mon Jul 11 02:46:43 2022
logging test Mon Jul 11 02:46:44 2022
logging test Mon Jul 11 02:46:45 2022
logging test Mon Jul 11 02:46:46 2022 <<- SIGUSR1
logging test Mon Jul 11 02:46:47 2022
logging test Mon Jul 11 02:46:48 2022
Clearly this change only impacts Uvicorn's gunicorn worker class. i.e. kill -USR1 pid after running it with uvicorn only
(uvicorn --host 0.0.0.0 --port 8000 main:app), the worker is still terminated even with the work above.
But USR1: Reopen the log files behaviour is only documented on Gunicorn and undocumented on Uvicorn master (yet) so this could be expected behaviour(?). Otherwise we would need to create an async version of handle_usr1 method in server.py.
Btw, we can't simply fully copy gunicorn's init_signals method as suggested by yinkh in https://github.com/encode/uvicorn/issues/896#issuecomment-762624434
As the below signals are not compatible with asyncio.
signal.signal(signal.SIGQUIT, self.handle_quit)
signal.signal(signal.SIGWINCH, self.handle_winch)
signal.signal(signal.SIGABRT, self.handle_abort)
We would end up with errors like below.
~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker main:app
[2022-07-10 23:49:12 +0900] [63819] [INFO] Starting gunicorn 20.1.0
[2022-07-10 23:49:12 +0900] [63819] [INFO] Listening at: http://127.0.0.1:8000 (63819)
[2022-07-10 23:49:12 +0900] [63819] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2022-07-10 23:49:12 +0900] [63820] [INFO] Booting worker with pid: 63820
[2022-07-10 23:49:12 +0900] [63820] [INFO] Started server process [63820]
[2022-07-10 23:49:12 +0900] [63820] [INFO] Waiting for application startup.
[2022-07-10 23:49:12 +0900] [63820] [INFO] Application startup complete.
[2022-07-10 23:49:59 +0900] [63820] [ERROR] Traceback (most recent call last):
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
self.run_forever()
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
self._run_once()
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 1845, in _run_once
event_list = self._selector.select(timeout)
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/selectors.py", line 562, in select
kev_list = self._selector.control(None, max_ev, timeout)
File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/gunicorn/workers/base.py", line 198, in handle_quit
sys.exit(0)
SystemExit: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/starlette/routing.py", line 638, in lifespan
await receive()
File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/uvicorn/lifespan/on.py", line 137, in receive
return await self.receive_queue.get()
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/queues.py", line 159, in get
await getter
asyncio.exceptions.CancelledError
[2022-07-10 23:49:59 +0900] [63820] [INFO] Worker exiting (pid: 63820)
[2022-07-10 23:49:59 +0900] [64039] [INFO] Booting worker with pid: 64039
[2022-07-10 23:49:59 +0900] [64039] [INFO] Started server process [64039]
[2022-07-10 23:49:59 +0900] [64039] [INFO] Waiting for application startup.
[2022-07-10 23:49:59 +0900] [64039] [INFO] Application startup complete.
^C[2022-07-10 23:50:49 +0900] [63819] [INFO] Handling signal: int
[2022-07-10 23:50:49 +0900] [64039] [ERROR] Traceback (most recent call last):
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
self.run_forever()
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
self._run_once()
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 1845, in _run_once
event_list = self._selector.select(timeout)
File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/selectors.py", line 562, in select
kev_list = self._selector.control(None, max_ev, timeout)
File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/gunicorn/workers/base.py", line 198, in handle_quit
sys.exit(0)
SystemExit: 0
...
Testing other signals such as TTIN, TTOU, QUIT, INT, TERM on gunicorn's signal handling page (https://docs.gunicorn.org/en/stable/signals.html) seem to work fine.
I have no idea if these two lines from gunicorn worker base class https://github.com/benoitc/gunicorn/blob/027f04b4b4aee4f50b980a7158add0feaf4c1b29/gunicorn/workers/base.py#L185 are needed though. It seem to work fine without it. So I excluded it for now:
if hasattr(signal, 'set_wakeup_fd'):
signal.set_wakeup_fd(self.PIPE[1])
Thanks.
kindly requesting a review from our last modifier @tiangolo 🙏 (original modification ref issue page: https://github.com/encode/uvicorn/issues/894)