contour icon indicating copy to clipboard operation
contour copied to clipboard

HTTPProxy: add cluster outlierDetection

Open yangyy93 opened this issue 2 years ago • 28 comments

add cluster outlierDetection. ref: #5317 design: #5460

yangyy93 avatar Jul 24 '23 08:07 yangyy93

Codecov Report

Attention: Patch coverage is 75.29412% with 21 lines in your changes missing coverage. Please review.

Project coverage is 81.53%. Comparing base (5f1b981) to head (3a2f324). Report is 166 commits behind head on main.

Files Patch % Lines
internal/dag/policy.go 67.56% 6 Missing and 6 partials :warning:
internal/dag/httpproxy_processor.go 33.33% 3 Missing and 1 partial :warning:
internal/envoy/cluster.go 77.77% 2 Missing and 2 partials :warning:
cmd/contour/serve.go 50.00% 1 Missing :warning:
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5575      +/-   ##
==========================================
- Coverage   81.56%   81.53%   -0.04%     
==========================================
  Files         133      133              
  Lines       15801    15886      +85     
==========================================
+ Hits        12888    12952      +64     
- Misses       2617     2629      +12     
- Partials      296      305       +9     
Files Coverage Δ
cmd/contour/servecontext.go 85.98% <100.00%> (+0.03%) :arrow_up:
internal/dag/dag.go 98.40% <ø> (ø)
internal/envoy/v3/cluster.go 96.62% <100.00%> (+0.29%) :arrow_up:
internal/protobuf/helpers.go 92.30% <100.00%> (+0.64%) :arrow_up:
pkg/config/parameters.go 88.03% <ø> (ø)
cmd/contour/serve.go 22.74% <50.00%> (+0.08%) :arrow_up:
internal/dag/httpproxy_processor.go 91.11% <33.33%> (-0.30%) :arrow_down:
internal/envoy/cluster.go 94.66% <77.77%> (-5.34%) :arrow_down:
internal/dag/policy.go 93.57% <67.56%> (-1.96%) :arrow_down:

codecov[bot] avatar Jul 24 '23 09:07 codecov[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Aug 16 '23 00:08 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Sep 06 '23 00:09 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Oct 06 '23 00:10 github-actions[bot]

not stale

yangyy93 avatar Oct 08 '23 01:10 yangyy93

Drive by comment:

Can we also add https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/bootstrap/v3/bootstrap.proto#config-bootstrap-v3-clustermanager-outlierdetection

From the Envoy docs

A log of outlier ejection events can optionally be produced by Envoy. This is extremely useful during daily operations since global stats do not provide enough information on which hosts are being ejected and for what reasons. The log is structured as protobuf-based dumps of OutlierDetectionEvent messages. Ejection event logging is configured in the Cluster manager outlier detection configuration.

Which makes the above o11y seem like a good idea

davinci26 avatar Oct 23 '23 12:10 davinci26

Also it would be nice if we could have this also applied to extension services as well such ext_auth

davinci26 avatar Oct 24 '23 19:10 davinci26

Drive by comment:

Can we also add https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/bootstrap/v3/bootstrap.proto#config-bootstrap-v3-clustermanager-outlierdetection

From the Envoy docs

A log of outlier ejection events can optionally be produced by Envoy. This is extremely useful during daily operations since global stats do not provide enough information on which hosts are being ejected and for what reasons. The log is structured as protobuf-based dumps of OutlierDetectionEvent messages. Ejection event logging is configured in the Cluster manager outlier detection configuration.

Which makes the above o11y seem like a good idea

thanks for your review, this is a good idea, I will add a switch to control whether logging is turned on or not. outlier ejection events

yangyy93 avatar Oct 27 '23 02:10 yangyy93

Also it would be nice if we could have this also applied to extension services as well such ext_auth

Now I just add outlier detection configuration on the services to be routed. I think we can discuss whether we should add relevant configuration on extension services.

yangyy93 avatar Oct 27 '23 02:10 yangyy93

Now I just add outlier detection configuration on the services to be routed. I think we can discuss whether we should add relevant configuration on extension services.

Works for me!

davinci26 avatar Oct 27 '23 12:10 davinci26

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Nov 11 '23 00:11 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Nov 29 '23 00:11 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Dec 19 '23 00:12 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Jan 04 '24 00:01 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Jan 27 '24 00:01 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Feb 20 '24 00:02 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Mar 13 '24 00:03 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Apr 12 '24 00:04 github-actions[bot]

any progress?

izturn avatar Apr 28 '24 06:04 izturn

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar May 13 '24 00:05 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Jun 03 '24 00:06 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Jul 03 '24 00:07 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Jul 19 '24 00:07 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Aug 03 '24 00:08 github-actions[bot]

The Contour project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 14d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the PR is closed

You can:

  • Ensure your PR is passing all CI checks. PRs that are fully green are more likely to be reviewed. If you are having trouble with CI checks, reach out to the #contour channel in the Kubernetes Slack workspace.
  • Mark this PR as fresh by commenting or pushing a commit
  • Close this PR
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

github-actions[bot] avatar Aug 18 '24 00:08 github-actions[bot]