Investigate if we can replace `GunicornMonitor` with `uvicorn.run()`
It is most likely that we no longer need GunicornMonitor or UvicornMonitor. @ashb's suggestion is that for Airflow, uvicorn.run() should be enough.
Whoever takes this GitHub issue should verify this and replace the monitor if it is no longer needed.
The code:
- https://github.com/apache/airflow/blob/f38d56dbf4dc1639142fc5a494d5da24996a56cc/airflow/cli/commands/fastapi_api_command.py#L159-L190
- https://github.com/apache/airflow/blob/f38d56dbf4dc1639142fc5a494d5da24996a56cc/airflow/cli/commands/webserver_command.py#L49-L107
The current GunicornMonitor provides the following capabilities:
- Automatic worker restarts if workers crash or hang: Ensures that if a worker crashes or becomes unresponsive, it is automatically restarted.
- Graceful worker scaling and reloads: This allows for addition and removal of workers and reloads workers gracefully when needed.
- Timeout management for unresponsive workers: Gunicorn monitors workers for unresponsiveness and can terminate them if they exceed a set timeout, preventing hangs.
If we switch to uvicorn.run(), we would lose these features since uvicorn.run() lacks built-in process management. Specifically:
If a worker dies, there's no master process to restart it. There will be no automatic scaling of workers, and no handling of worker timeouts or periodic restarts. To replicate this functionality, we would need an external process manager like systemd or supervisord, which adds additional complexity and overhead.
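To make "a master process that restarts dead workers" concrete, here is a minimal sketch of that loop using only the standard library. It is an illustration under stated assumptions, not how Gunicorn or Airflow implement it: `supervise`, `shutdown`, `CRASHING`, and `HEALTHY` are hypothetical names, and the loop polls child processes instead of reacting to a CHLD signal.

```python
import subprocess
import sys
import time

# Hypothetical worker commands: one exits at once (a "crash"), one keeps running.
CRASHING = [sys.executable, "-c", "import sys; sys.exit(1)"]
HEALTHY = [sys.executable, "-c", "import time; time.sleep(30)"]


def supervise(commands, replacement, max_restarts, poll_interval=0.05, deadline=10.0):
    """Start one process per command and replace any that exit.

    Stops after max_restarts replacements or after deadline seconds,
    and returns (number_of_restarts, current_processes).
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    restarts = 0
    stop_at = time.monotonic() + deadline
    while restarts < max_restarts and time.monotonic() < stop_at:
        for i, proc in enumerate(procs):
            # poll() returns an exit code once the worker has terminated.
            if proc.poll() is not None and restarts < max_restarts:
                procs[i] = subprocess.Popen(replacement)
                restarts += 1
        time.sleep(poll_interval)
    return restarts, procs


def shutdown(procs):
    """Terminate all workers and wait for them to exit."""
    for proc in procs:
        proc.terminate()
    for proc in procs:
        proc.wait()
```

Calling `supervise([CRASHING, HEALTHY], HEALTHY, max_restarts=1)` replaces the crashed worker once and returns; a real manager would also need timeouts for hung workers, which this sketch does not cover.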
cc: @kaxil @ashb
For 2: https://docs.gunicorn.org/en/stable/signals.html
- TTIN: Increment the number of processes by one
- TTOU: Decrement the number of processes by one
If a worker dies, there's no master process to restart it
Doesn't Gunicorn do that itself? https://docs.gunicorn.org/en/stable/design.html#master
The master process is a simple loop that listens for various process signals and reacts accordingly. It manages the list of running workers by listening for signals like TTIN, TTOU, and CHLD. TTIN and TTOU tell the master to increase or decrease the number of running workers. CHLD indicates that a child process has terminated, in this case the master process automatically restarts the failed worker.
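Since TTIN and TTOU are ordinary POSIX signals, a monitor can resize the worker pool simply by signalling the master. A minimal sketch, assuming a Unix host and a known master PID; `scaling_signals` and `scale_workers` are hypothetical helper names, not Gunicorn or Airflow APIs:

```python
import os
import signal


def scaling_signals(current, target):
    """Return the signals to send to the master to move from current to target workers."""
    if current < 0 or target < 0:
        raise ValueError("worker counts must be non-negative")
    # TTIN adds one worker per signal, TTOU removes one worker per signal.
    step = signal.SIGTTIN if target > current else signal.SIGTTOU
    return [step] * abs(target - current)


def scale_workers(master_pid, current, target):
    """Send one TTIN or TTOU to the master process per worker to add or remove."""
    for sig in scaling_signals(current, target):
        os.kill(master_pid, sig)
```

For example, `scale_workers(master_pid, current=2, target=4)` would send TTIN twice. Restarting a worker that died is handled by the master itself on CHLD, so no signal from the monitor is needed for that case.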
So it's only the case of "worker hang" that might not be there anymore. Let me think
For 1: https://docs.gunicorn.org/en/stable/settings.html#timeout I think?
Just one comment here: I've heard (mostly through the grapevine) that for quite a long time uvicorn has been able to manage multiple processes and handle sync requests directly on its own, and that this is more and more recommended in production, so there is basically no need to use gunicorn at all.
Again, this is more of an "overheard" thing, but looking at https://www.uvicorn.org/deployment/#using-a-process-manager , maybe that's what we are looking for? (Or maybe I misunderstood what we want to do; I just wanted to mention that gunicorn might not be needed at all.)
To perform the comparison, I replaced the Gunicorn code in the else block with the uvicorn.run call below:

    uvicorn.run(
        "airflow.api_fastapi.main:app",
        host=args.hostname,
        port=args.port,
        workers=num_workers,
        timeout_keep_alive=worker_timeout,
        timeout_graceful_shutdown=worker_timeout,
        ssl_keyfile=ssl_key,
        ssl_certfile=ssl_cert,
        access_log=access_logfile,
    )
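As a rough sketch of how the CLI values could be mapped onto `uvicorn.run()` keyword arguments, the helper below builds the argument dictionary separately so it can be inspected without starting a server. `build_uvicorn_kwargs` and `serve` are hypothetical names, and the mapping is an assumption based on the call above. One detail worth checking: `access_log` is a boolean in `uvicorn.run()`, so a log file path passed there only enables access logging.

```python
def build_uvicorn_kwargs(hostname, port, num_workers, worker_timeout,
                         ssl_cert=None, ssl_key=None, access_log=True):
    """Translate CLI-style values into keyword arguments for uvicorn.run()."""
    kwargs = {
        "host": hostname,
        "port": port,
        "workers": num_workers,
        "timeout_keep_alive": worker_timeout,
        "timeout_graceful_shutdown": worker_timeout,
        # uvicorn expects a bool here, not a file path.
        "access_log": bool(access_log),
    }
    # Only pass TLS settings when both the certificate and the key are given.
    if ssl_cert and ssl_key:
        kwargs["ssl_certfile"] = ssl_cert
        kwargs["ssl_keyfile"] = ssl_key
    return kwargs


def serve(**cli_values):
    """Start the app; uvicorn is imported lazily so the helper works without it."""
    import uvicorn

    uvicorn.run("airflow.api_fastapi.main:app", **build_uvicorn_kwargs(**cli_values))
```

Note that `timeout_keep_alive` closes idle keep-alive connections, which is not the same thing as Gunicorn's worker timeout, so reusing `worker_timeout` for it may deserve a second look.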
I used Locust for performance testing with the configuration below.
These are the stats comparing uvicorn.run() with Gunicorn + GunicornMonitor:
Comparison: Uvicorn vs. Gunicorn Performance
Request Statistics
| Metric | Uvicorn | Gunicorn |
|---|---|---|
| Total Requests | 14,714 | 14,726 |
| Total Failures | 0 | 13 |
| Average Response Time | 12.05 ms | 13.46 ms |
| Min Response Time | 7 ms | 1 ms |
| Max Response Time | 195 ms | 216 ms |
| Average Size (bytes) | 4,608 | 4,603.93 |
| Requests Per Second (RPS) | 49.05 | 49.09 |
| Failures Per Second | 0 | 0.04 |
Observations
- Response Times:
  - Uvicorn demonstrates slightly lower average and maximum response times compared to Gunicorn.
  - Percentile analysis shows Uvicorn's response times are more consistent, with fewer extreme values at higher percentiles.
- Failures:
  - Uvicorn had no failures, whereas Gunicorn recorded 13 failures caused by `RemoteDisconnected` errors. This could indicate potential issues in connection handling under load.
- Performance Consistency:
  - Uvicorn offers better consistency and reliability based on the above data.
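The figures in the table can be cross-checked with a few lines of arithmetic. This is only a sanity check on the reported numbers, assuming RPS is total requests divided by run duration; the variable names are illustrative.

```python
# Values copied from the Request Statistics table.
uvicorn_stats = {"requests": 14714, "failures": 0, "avg_ms": 12.05, "rps": 49.05}
gunicorn_stats = {"requests": 14726, "failures": 13, "avg_ms": 13.46, "rps": 49.09}

# Implied run duration in seconds: total requests / requests per second.
uvicorn_duration = uvicorn_stats["requests"] / uvicorn_stats["rps"]
gunicorn_duration = gunicorn_stats["requests"] / gunicorn_stats["rps"]

# Gunicorn failure rate as a percentage of all requests, and failures per second.
gunicorn_failure_pct = 100 * gunicorn_stats["failures"] / gunicorn_stats["requests"]
gunicorn_failures_per_s = gunicorn_stats["failures"] / gunicorn_duration

# Relative difference in average response time, as a percentage of Gunicorn's.
avg_latency_gain_pct = 100 * (gunicorn_stats["avg_ms"] - uvicorn_stats["avg_ms"]) / gunicorn_stats["avg_ms"]

print(f"durations: {uvicorn_duration:.1f}s vs {gunicorn_duration:.1f}s")
print(f"gunicorn failure rate: {gunicorn_failure_pct:.3f}% ({gunicorn_failures_per_s:.2f}/s)")
print(f"uvicorn average latency lower by {avg_latency_gain_pct:.1f}%")
```

Both runs work out to roughly 300 seconds, the 13 failures are under 0.1% of Gunicorn's requests, and the average latency gap is about 10%, so a single run like this supports "slightly better" but is probably too small a sample to rule out noise.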
Nice!