Provide a way to detect unhealthy connections

Open ikhoon opened this issue 1 year ago • 1 comments

Motivation:

Currently, there is no extension point to detect errors for specific connections and terminate connections that are unhealthy.

Related: #5717 #5751

API design:

OutlierDetector is similar to CircuitBreaker, but its state is simpler: it transitions in one direction only (once a target is marked as an outlier, it stays that way), and it is optimized for ephemeral resources such as connections and streams.

  • OutlierDetector determines whether the target is an outlier based on onSuccess() and onFailure() events.
  • OutlierDetectingRule is used to decide whether a request fails.
  • OutlierDetectionDecision is the result of OutlierDetectingRule.
    • OutlierDetectionDecision.FATAL is a special result that can immediately mark the target as an outlier.
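
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the implementation in this PR: it assumes a plain failure-rate threshold over all recorded events instead of a sliding window, and `SimpleOutlierDetector`, `Decision`, `report()` and `minimumRequests` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical decision type; the PR's OutlierDetectionDecision may differ.
enum Decision { SUCCESS, FAILURE, FATAL, IGNORE }

// Hypothetical detector: once marked as an outlier, the target never recovers.
final class SimpleOutlierDetector {
    private final double failureRateThreshold;
    private final long minimumRequests;
    private final AtomicLong successes = new AtomicLong();
    private final AtomicLong failures = new AtomicLong();
    private final AtomicBoolean outlier = new AtomicBoolean();

    SimpleOutlierDetector(double failureRateThreshold, long minimumRequests) {
        this.failureRateThreshold = failureRateThreshold;
        this.minimumRequests = minimumRequests;
    }

    void report(Decision decision) {
        switch (decision) {
            case SUCCESS:
                successes.incrementAndGet();
                break;
            case FAILURE:
                failures.incrementAndGet();
                evaluate();
                break;
            case FATAL:
                // FATAL bypasses the failure rate and marks the target immediately.
                outlier.set(true);
                break;
            default:
                // IGNORE: the event does not affect the counters.
                break;
        }
    }

    private void evaluate() {
        final long failed = failures.get();
        final long total = failed + successes.get();
        if (total >= minimumRequests && (double) failed / total >= failureRateThreshold) {
            // One-directional: the flag is only ever set, never cleared.
            outlier.set(true);
        }
    }

    boolean isOutlier() {
        return outlier.get();
    }
}
```

Because the flag is never reset, a caller would typically discard the detector together with the connection or stream it was tracking.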

Example:

OutlierDetectingRule rule =
  OutlierDetectingRule
    .builder()
    .onServerError()
    .onException(IOException.class)
    // A write timeout immediately marks the connection as an outlier.
    .onException(WriteTimeoutException.class, OutlierDetectionDecision.FATAL)
    .build();

OutlierDetection outlierDetection =
  OutlierDetection
    .builder(rule)
    .counterSlidingWindow(Duration.ofSeconds(10))
    .counterUpdateInterval(Duration.ofSeconds(1))
    .failureRateThreshold(0.5)
    .build();

ClientFactory factory =
  ClientFactory
    .builder()
    // Apply the OutlierDetection to detect and close unhealthy connections
    .connectionOutlierDetection(outlierDetection)
    .build();

Modifications:

  • Add OutlierDetector, OutlierDetectingRule and OutlierDetectionDecision and their implementations to common.outlier package.
  • Move EventCounter, EventCount and SlidingWindowCounter to common.util package and expose EventCounter and EventCount as public APIs to minimize duplication.
    • SlidingWindowCounter can be created with EventCounter.ofSlidingWindow(...).
  • Unlike CircuitBreakerRule, OutlierDetectingRule returns a decision synchronously and is simplified to look only at headers and causes, because:
    • I don't think the response content is necessary to detect an outlier.
    • The response status and exception type are sufficient information to determine the result.
  • Create an OutlierDetector in HttpSessionHandler
    • A hook applying OutlierDetector is added right after HttpSessionHandler.invoke() is called.
  • (Deprecation) CircuitBreakerListener.onEventCountUpdated(String, .circuitbreaker.EventCount) has been deprecated in favor of CircuitBreakerListener.onEventCountUpdated(String, .util.EventCount).
  • Add ClientFactoryBuilder.connectionOutlierDetection(OutlierDetection) to detect unhealthy connections.
    • This option is disabled by default.
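
To illustrate why a synchronous rule is enough when only the status and the cause are inspected, here is a rough sketch. It is not the OutlierDetectingRule added by this PR: `SimpleRule`, `RuleDecision` and `decide()` are hypothetical names, the status is passed as a plain `int`, and treating only 5xx as a failure is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical decision type for this sketch.
enum RuleDecision { SUCCESS, FAILURE, FATAL }

// Hypothetical rule: decides from the status code and the cause only,
// so no response content has to be awaited.
final class SimpleRule {
    // Insertion order is kept so that the first matching exception type wins.
    private final Map<Class<? extends Throwable>, RuleDecision> byException =
            new LinkedHashMap<>();

    SimpleRule onException(Class<? extends Throwable> type, RuleDecision decision) {
        byException.put(type, decision);
        return this;
    }

    RuleDecision decide(int statusCode, Throwable cause) {
        if (cause != null) {
            for (Map.Entry<Class<? extends Throwable>, RuleDecision> e : byException.entrySet()) {
                if (e.getKey().isInstance(cause)) {
                    return e.getValue();
                }
            }
            // Unmatched exceptions fall through to the status check below.
        }
        if (statusCode >= 500 && statusCode <= 599) {
            return RuleDecision.FAILURE;
        }
        return RuleDecision.SUCCESS;
    }
}
```

In this sketch, more specific exception types need to be registered before their supertypes, since the first match is returned.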

Result:

  • You can now use OutlierDetection to detect unhealthy connections and close them gracefully.
  • Closes #5751

ikhoon avatar Jun 14 '24 11:06 ikhoon

🔍 Build Scan® (commit: 76c38f1c380f4a63933784afc8c4ef28421f031b)

Job name Status Build Scan®
build-self-hosted-unsafe-jdk-8 https://ge.armeria.dev/s/a2l3qfp437e4g
build-self-hosted-unsafe-jdk-21-snapshot-blockhound https://ge.armeria.dev/s/rob6boeoowa5i
build-self-hosted-unsafe-jdk-17-min-java-17-coverage https://ge.armeria.dev/s/qb3yyvpwsxsgq
build-self-hosted-unsafe-jdk-17-min-java-11 https://ge.armeria.dev/s/r6eldo3vnos2w
build-self-hosted-unsafe-jdk-17-leak https://ge.armeria.dev/s/ds2dzip3zfckm
build-self-hosted-unsafe-jdk-11 https://ge.armeria.dev/s/wpojdg464ddu2
build-macos-12-jdk-21 https://ge.armeria.dev/s/fzw52exefj6xs

github-actions[bot] avatar Jun 14 '24 11:06 github-actions[bot]

It seems like OutlierDetectingRule is used alongside OutlierDetector, and both are retrieved and created from this OutlierDetection. What do you think of adding OutlierDetectingRule to OutlierDetector instead?

They are used together, but the roles of OutlierDetector and OutlierDetectingRule are different. OutlierDetector is designed as a utility and may be used alone, so I didn't want to force users who only need OutlierDetector to implement OutlierDetectingRule.

ikhoon avatar Jul 04 '24 02:07 ikhoon

OutlierDetector is designed as a utility. It may be used alone.

Haven't thought about this case. Then, I'm fine with the current design. 👍

minwoox avatar Jul 04 '24 06:07 minwoox