armeria icon indicating copy to clipboard operation
armeria copied to clipboard

fix: Provide a way to customize the graceful shutdown delay of Netty `EventLoopGroup`s

Open novoj opened this issue 1 month ago โ€ข 6 comments

Added parameters to enable setting the quiet period and timeout for graceful shutdown of worker groups, both globally and through the ClientFactoryBuilder.

Make use of graceful shutdown settings on worker groups on the server side.

Refs: #5813

Motivation:

My E2E integration tests are running slowly just because they wait for default graceful shutdown timeout to expire.

Modifications:

  • made use of existing graceful shutdown settings on the server side
  • added new graceful (missing) shutdown settings on the client side

Result:

  • Closes #5813.
  • this pull request has no consequences for the user (#IMHO) - as a default values I configured the defaults that are applied by Netty default implementation
  • user can now explicitly configure zero timeout and quiet period both for server and client side, which results in immediate closing of both and speeding up the termination phase considerably

novoj avatar Nov 30 '25 21:11 novoj

[!WARNING]

Rate limit exceeded

@novoj has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 2 minutes and 17 seconds before requesting another review.

โŒ› How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

๐Ÿšฆ How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

๐Ÿ“ฅ Commits

Reviewing files that changed from the base of the PR and between 81e5b8787f56c36a757b9883cf4be7c467fc3e51 and 1ae41f14293d52bcc7eab11122c2b746546a1229.

๐Ÿ“’ Files selected for processing (1)
  • core/src/main/java/com/linecorp/armeria/client/ClientFactoryOptions.java (3 hunks)

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds configurable graceful-shutdown quiet period and timeout (milliseconds) for Netty EventLoopGroups used by ClientFactory and Server via new builder methods, options, flags, default providers, and by calling shutdownGracefully(quiet, timeout, TimeUnit.MILLISECONDS).

Changes

Cohort / File(s) Summary
Client Factory Builder API
core/src/main/java/com/linecorp/armeria/client/ClientFactoryBuilder.java
Added fluent setters: workerGroupGracefulShutdownQuietPeriod(Duration), workerGroupGracefulShutdownQuietPeriodMillis(long), workerGroupGracefulShutdownTimeout(Duration), workerGroupGracefulShutdownTimeoutMillis(long). Duration overloads delegate to millis variants, validate inputs, and set new ClientFactoryOptions keys.
Client Factory Options
core/src/main/java/com/linecorp/armeria/client/ClientFactoryOptions.java
Added ClientFactoryOption<Long> fields: WORKER_GROUP_GRACEFUL_SHUTDOWN_QUIET_PERIOD_MILLIS, WORKER_GROUP_GRACEFUL_SHUTDOWN_TIMEOUT_MILLIS and instance accessors workerGroupGracefulShutdownQuietPeriodMillis() and workerGroupGracefulShutdownTimeoutMillis().
HTTP Client Factory Implementation
core/src/main/java/com/linecorp/armeria/client/HttpClientFactory.java
Replaced parameterless shutdownGracefully() with shutdownGracefully(quietPeriod, timeout, TimeUnit.MILLISECONDS) using configured values; attaches listener to log failure and complete the outer future regardless of outcome.
Graceful Shutdown Defaults & Providers
core/src/main/java/com/linecorp/armeria/common/Flags.java, core/src/main/java/com/linecorp/armeria/common/FlagsProvider.java, core/src/main/java/com/linecorp/armeria/common/DefaultFlagsProvider.java
Introduced default constants and provider accessors for client worker-group graceful-shutdown quiet period and timeout (millis); Flags exposes new default accessors; DefaultFlagsProvider supplies default values.
Server Shutdown Logic
core/src/main/java/com/linecorp/armeria/server/Server.java
Updated worker and boss group shutdown calls to shutdownGracefully(quietPeriod, timeout, TimeUnit.MILLISECONDS) using server configuration values instead of no-argument calls.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant App as Application (Server/ClientFactory)
    participant Flags as Flags/Options
    participant Builder as Builder / ClientFactoryOptions
    participant ELG as EventLoopGroup (worker/boss)
    participant Logger as Logger/Future

    App->>Builder: read graceful shutdown settings
    Builder->>Flags: resolve defaults / overridden values
    App->>ELG: call shutdownGracefully(quietMillis, timeoutMillis, MILLISECONDS)
    ELG-->>App: returns Future
    ELG->>Logger: onFailure: log warning
    ELG-->>App: complete Future (success or failure)

Estimated code review effort

๐ŸŽฏ 3 (Moderate) | โฑ๏ธ ~25 minutes

  • Review propagation of defaults from FlagsProvider โ†’ Flags โ†’ DefaultFlagsProvider โ†’ ClientFactoryOptions โ†’ builders.
  • Verify input validation (null Duration, non-negative millis) and unit conversions.
  • Confirm shutdown invocation uses consistent units and that listener semantics in HttpClientFactory correctly complete the outer future and log failures.

Poem

๐Ÿฐ I nudged the clocks from seconds to millis,
Quiet as carrots, swift as the hills.
Builders now whisper the shutdown decree,
Threads tuck in gently โ€” quick, light, and free. ๐Ÿฅ•โœจ

Pre-merge checks and finishing touches

โŒ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage โš ๏ธ Warning Docstring coverage is 78.95% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
โœ… Passed checks (4 passed)
Check name Status Explanation
Title check โœ… Passed The title accurately captures the main change: adding configuration for Netty EventLoopGroup graceful shutdown behavior.
Description check โœ… Passed The description clearly relates to the changeset by explaining the motivation, modifications, and results of adding graceful shutdown customization to both server and client sides.
Linked Issues check โœ… Passed The PR implements the core requirements from #5813: adds graceful shutdown configuration for EventLoopGroups via ClientFactoryBuilder and applies it on both server and client sides.
Out of Scope Changes check โœ… Passed All changes are directly related to implementing graceful shutdown configuration for Netty EventLoopGroups as specified in #5813, with no extraneous modifications.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

โค๏ธ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Nov 30 '25 21:11 coderabbitai[bot]

๐Ÿ” Build Scanยฎ (commit: 1ae41f14293d52bcc7eab11122c2b746546a1229)

Job name Status Build Scanยฎ
build-ubicloud-standard-16-jdk-8 โŒ (failure) https://ge.armeria.dev/s/y5tj7vr76qu6y
build-ubicloud-standard-16-jdk-21-snapshot-blockhound โœ… https://ge.armeria.dev/s/jgqewnpmn5ax2
build-ubicloud-standard-16-jdk-17-min-java-17-coverage โœ… https://ge.armeria.dev/s/pdpfbewunxih6
build-ubicloud-standard-16-jdk-17-min-java-11 โœ… https://ge.armeria.dev/s/4hgb23vhhfwcg
build-ubicloud-standard-16-jdk-17-leak โœ… https://ge.armeria.dev/s/zyifltldde2ew
build-ubicloud-standard-16-jdk-11 โœ… https://ge.armeria.dev/s/zqmwb3qbpbt44
build-macos-latest-jdk-21 โœ… https://ge.armeria.dev/s/5tesuh7edbklw

github-actions[bot] avatar Dec 01 '25 02:12 github-actions[bot]

Generalizing GracefulShutdown sounds a good idea. In addition, it would also be great if toException() could be applied to the client's pending requests.

Currently, when a ClientFactory is closed, all channels are closed accordingly, which causes the pending requests to end with ClosedSessionException. I think it would be clearer to use a ShuttingDownException instead, so that users can see that the in-flight requests were terminated explicitly by closing the ClientFactory, rather than a transport error. toException could be used an extension point to customize the default behavior.

ikhoon avatar Dec 01 '25 06:12 ikhoon

I like the approach here and the feedback from @jrhee17 and @ikhoon. Thanks for bringing back to the table, @novoj! ๐Ÿ™‡

trustin avatar Dec 02 '25 07:12 trustin

I'd be grateful if @jrhee17 or @ikhoon help me to carve out an idea where to start with the changes and where to look. I'm newbie to Armeria codebase. Pinpointing key classes / methods which would be impacted by the change would help a lot.

novoj avatar Dec 02 '25 07:12 novoj

I think it's best to start with the direct issue you would like to solve - specifying GracefulShutdown for ClientFactory#workerGroup.

What do you think of starting with adding the following APIs in ClientFactoryBuilder.java, ClientFactoryOption.java?

    public ClientFactoryBuilder workerGroup(EventLoopGroup workerGroup, boolean shutdownOnClose,
                                                                          GracefulShutdown gracefulShutdown) {
    }

    public ClientFactoryBuilder workerGroup(int numThreads, GracefulShutdown gracefulShutdown) {
    }
    public static final ClientFactoryOption<GracefulShutdown> WORKER_GROUP_GRACEFUL_SHUTDOWN =
            ClientFactoryOption.define("WORKER_GROUP_GRACEFUL_SHUTDOWN", Flags.workerGroupGracefulShutdown());

jrhee17 avatar Dec 03 '25 02:12 jrhee17