Implement graceful shutdown for Garnet server
This pull request adds a graceful shutdown mechanism for the Garnet server, covering both the Windows service and the console application entry points. When a shutdown is requested (via service stop, Ctrl+C, or process exit), the server stops accepting new connections, waits for existing connections to finish, commits all data (AOF), and takes a checkpoint if necessary, all within a configurable timeout. This improves data durability and operational reliability during shutdowns.
Closes #1382 and resolves #1390. This PR also incorporates the review feedback in https://github.com/microsoft/garnet/pull/1383#discussion_r2535724513
Graceful Shutdown Implementation
- Added a new `ShutdownAsync` method to `main/GarnetServer/GarnetServer` that orchestrates the graceful shutdown process: stops accepting new connections, waits for active connections to finish (with timeout), commits AOF, and takes a checkpoint if tiered storage is enabled.
- Modified the Windows service (`Worker.StopAsync`) and console app (`Program.Main`) to use the new `ShutdownAsync` method, ensuring consistent and graceful shutdown behavior in both entrypoints.
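The ordering described in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the `IShutdownTarget` interface, its member names, the polling interval, and the `RecordingServer` fake are all hypothetical, and the choice to still persist after the drain timeout elapses is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over the operations this PR describes.
// Member names are illustrative assumptions, not Garnet's actual API.
public interface IShutdownTarget
{
    void StopListening();
    int ActiveConnectionCount { get; }
    bool AofEnabled { get; }
    bool StorageTierEnabled { get; }
    Task CommitAofAsync();
    Task TakeCheckpointAsync();
}

public static class GracefulShutdownSketch
{
    public static async Task ShutdownAsync(IShutdownTarget server, TimeSpan timeout)
    {
        // 1. Stop accepting new connections so the active set can only shrink.
        server.StopListening();

        // 2. Poll until active connections drain or the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            while (server.ActiveConnectionCount > 0)
                await Task.Delay(TimeSpan.FromMilliseconds(50), cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Timed out: this sketch still proceeds to persistence (an assumption).
        }

        // 3. Persist state: commit the AOF, then checkpoint if tiered storage is on.
        if (server.AofEnabled)
            await server.CommitAofAsync();
        if (server.StorageTierEnabled)
            await server.TakeCheckpointAsync();
    }
}

// In-memory fake that records call order, for illustration only.
public sealed class RecordingServer : IShutdownTarget
{
    public List<string> Calls { get; } = new List<string>();
    public int ActiveConnectionCount { get; set; }
    public bool AofEnabled { get; set; } = true;
    public bool StorageTierEnabled { get; set; } = true;
    public void StopListening() => Calls.Add("StopListening");
    public Task CommitAofAsync() { Calls.Add("CommitAof"); return Task.CompletedTask; }
    public Task TakeCheckpointAsync() { Calls.Add("TakeCheckpoint"); return Task.CompletedTask; }
}
```

Closing the listener before draining matters: if new connections could still arrive, the wait in step 2 might never observe an empty connection set within the timeout.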
Server Interface and Networking Enhancements
- Extended the server interface (`libs/server/Servers/IGarnetServer`) and base classes to support stopping listening for new connections via a new `StopListening` method, and implemented this for TCP servers to close the listen socket cleanly.
Data Durability and Checkpointing
- Added new APIs in `libs/server/Servers/StoreApi` to take a checkpoint and to check if AOF or storage tier is enabled, supporting the shutdown flow for data durability.
Infrastructure and Code Quality Improvements
- Ensured proper disposal patterns and resource cleanup, including calling `base.Dispose()` and suppressing finalization.
- Updated using directives and minor code structure for clarity and consistency.
Configuration and Timeout Handling
- Added configuration for shutdown timeout in the Windows service host, defaulting to 5 seconds for graceful shutdown.
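
For context, one common way to set a shutdown timeout on a .NET Generic Host is `HostOptions.ShutdownTimeout`. The fragment below is a sketch under the assumption that the Windows service host is built with `Host.CreateApplicationBuilder` and that the timeout is applied through `HostOptions`; the exact mechanism used in this PR may differ.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

// Give hosted services up to 5 seconds to complete StopAsync
// before the host stops waiting for them (value taken from this description).
builder.Services.Configure<HostOptions>(options =>
{
    options.ShutdownTimeout = TimeSpan.FromSeconds(5);
});

using var host = builder.Build();
host.Run();
```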
These changes collectively make server shutdowns safer and more reliable, reducing the risk of data loss or corruption during restarts or deployments.