Added support for overload manager
This change adds minimal support for Envoy's overload manager to avoid cases where the Envoy process is terminated by the out-of-memory killer, which results in traffic disturbances.
This PR proposes that the administrator can (optionally) configure the maximum amount of heap that Envoy is allowed to reserve. It does not allow the overload actions to be added or configured in any way. Instead, it configures default actions which are set according to the example in the "Configuring Envoy as an edge proxy" best practices doc: the `shrink_heap` action is executed when 95%, and the `stop_accepting_requests` action when 98%, of the configured maximum heap is reached.
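For illustration, the default actions described above correspond roughly to an Envoy bootstrap fragment like the following. This is a sketch following the Envoy edge-proxy best practices doc, not the exact output of `contour bootstrap`; the 2 GiB `max_heap_size_bytes` is an arbitrary example value.

```yaml
overload_manager:
  resource_monitors:
    - name: "envoy.resource_monitors.fixed_heap"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
        max_heap_size_bytes: 2147483648   # example: 2 GiB
  actions:
    - name: "envoy.overload_actions.shrink_heap"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.95
    - name: "envoy.overload_actions.stop_accepting_requests"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.98
```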
The configuration of the maximum heap is (unfortunately) again a new command line flag. The same has been true for all other bootstrap parameters so far. The reasoning is that `contour bootstrap` is executed inside the Envoy init container, where we do not have the Contour config file or the capability to read the ContourConfiguration CR from the API server. Or at least we have not done that so far.
There is a major conflict between the overload manager and how we expose `/ready` and `/stats` by setting up a proxy to serve these endpoints! While the real admin API at `/admin/admin.sock` still works during overload, requests via the "proxied" versions of `/ready` and `/stats` served over a TCP socket will be rejected while `stop_accepting_requests` is active. As a result, Envoy will be removed from the endpoints of the service. Since the Envoy instance was not accepting new requests anyway, maybe this will not make the overload any worse for the other Envoys. However, another side effect is that the administrator cannot monitor the Envoy instance anymore, since the stats endpoint will not be served either. The memory-related metrics would be of particular interest here: they would show that the `stop_accepting_requests` action is active, along with the heap numbers explaining why. When an admin sets the max heap too low, they will not be able to find that out by checking metrics - since metrics are not served due to the heap being low :thinking:
The feature itself seems very useful as it can avoid the OOM killer, but I'd like to hear your opinion about the limitations.
Fixes #1794
Signed-off-by: Tero Saarni [email protected]
As a workaround, I found the following commands helpful for accessing the "real" admin API when the "proxied" admin API endpoints are rejecting requests due to overload:
sudo curl --silent --unix-socket /proc/$(pidof envoy)/root/admin/admin.sock http://localhost/stats | grep -E "^overload|^server.memory"
sudo curl --silent --unix-socket /proc/$(pidof envoy)/root/admin/admin.sock http://localhost/memory # tcmalloc metrics
These need to be executed on the worker node, or directly on the dev host when running Kind.
Envoy's `fixed_heap` monitor uses the tcmalloc metrics at `/memory` and the following formula to calculate the overload percentage: `(heap_size - pageheap_unmapped) / maximum_heap`
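That calculation can be sketched in Go as follows. This is illustrative code, not part of the PR or Envoy; the function name and example inputs are made up for the demonstration:

```go
package main

import "fmt"

// overloadPercent mirrors the fixed_heap monitor's formula:
// (heap_size - pageheap_unmapped) / maximum_heap.
// heapSize and pageheapUnmapped come from tcmalloc's /memory metrics;
// tcmalloc guarantees pageheap_unmapped <= heap_size.
func overloadPercent(heapSize, pageheapUnmapped, maxHeap uint64) float64 {
	if maxHeap == 0 || pageheapUnmapped > heapSize {
		return 0
	}
	return float64(heapSize-pageheapUnmapped) / float64(maxHeap)
}

func main() {
	// Example: ~1.9 GiB of in-use heap against a 2 GiB configured maximum
	// crosses the 0.95 shrink_heap threshold but not the 0.98
	// stop_accepting_requests threshold.
	p := overloadPercent(2040109466, 0, 2147483648)
	fmt.Printf("%.3f\n", p)
}
```

Note that unmapped pages are subtracted: heap that tcmalloc has released back to the OS does not count against the configured maximum.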
Codecov Report
Merging #4597 (f10ff6b) into main (d8553a8) will increase coverage by 0.14%. The diff coverage is 97.50%.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4597      +/-   ##
==========================================
+ Coverage   76.08%   76.23%   +0.14%
==========================================
  Files         140      140
  Lines       13073    13147      +74
==========================================
+ Hits         9947    10023      +76
+ Misses       2872     2871       -1
+ Partials      254      253       -1
Impacted Files | Coverage Δ | |
---|---|---|
cmd/contour/bootstrap.go | 0.00% <0.00%> (ø) | |
internal/envoy/bootstrap.go | 55.88% <ø> (ø) | |
internal/envoy/v3/bootstrap.go | 94.33% <100.00%> (+0.82%) | :arrow_up: |
internal/dag/dag.go | 95.53% <0.00%> (ø) | |
internal/envoy/v3/route.go | 73.95% <0.00%> (+0.21%) | :arrow_up: |
internal/sorter/sorter.go | 98.79% <0.00%> (+0.60%) | :arrow_up: |
internal/dag/httpproxy_processor.go | 92.80% <0.00%> (+0.65%) | :arrow_up: |
I think in cases where the heap is low, losing metrics is definitely less of a big deal than being OOM-killed. It seems like this is about as good a compromise as we're going to be able to get, sadly.
It's unfortunate that we have to make the heap size a bootstrap cmdline param, but I don't see any other way to do it.
I also think that this feature has to come with a bunch of warnings about being careful with your sizing, making sure that it matches up with any Pod requests and limits you've put on Envoy, and so on.
I'll give the PR a more detailed review soon, sorry about the delay @tsaarni.
Marking this PR stale since there has been no activity for 14 days. It will be closed if there is no activity for another 30 days.
I will come back with some documentation shortly.
Rebased, and the missing documentation `site/content/docs/main/config/overload-manager.md` has been added.
Marking this PR stale since there has been no activity for 14 days. It will be closed if there is no activity for another 30 days.
This PR is ready for review.
Sorry for the delay on this @tsaarni, planning to take a look soon!
Thanks @sunjayBhatia for the review!
this and the below might need an update to match the bootstrap config (looks like 90% and 98% for these two actions)
Thanks for spotting this! I went the other direction and changed the bootstrap config to match the documentation, since the values came from the Envoy best practices document. I think I had no real reason to use 90% instead of 95%.
If there are no further questions, I'll merge this tomorrow.
@skriss Thank you for the review!
Just a couple tiny things but this looks good to me. The issue with the admin endpoint is unfortunate, and maybe we can do some more thinking there on if there's something else we can do, but I don't think it needs to block getting the initial PR in, seems like a net improvement.
I agree. I could not figure out anything that could be done on the Contour side, except removing the proxy from the admin API, but that is there for a reason. The overload manager cannot be applied to selected named listeners only, nor the other way around: it cannot be configured to ignore certain listeners...