contour
Investigate enabling overload manager for Envoy
As mentioned in #1375, Envoy has an overload manager that we should investigate.
It's generic by design, but currently it is effectively a memory (that is, heap size) manager. You can configure a maximum heap size, and then different actions to take at different percentages of that heap size.
Sample from the Envoy PR:
overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
  - name: "envoy.resource_monitors.fixed_heap"
    config:
      # TODO: Tune for your system.
      max_heap_size_bytes: 2147483648 # 2 GiB
  actions:
  - name: "envoy.overload_actions.shrink_heap"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.95
  - name: "envoy.overload_actions.stop_accepting_requests"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.98
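To make the threshold behaviour concrete, here is a rough Go sketch of how the two actions in the sample relate to heap usage. It is only an illustration of the idea, not Envoy's implementation: the `action` type, `sampleActions`, and `triggered` are hypothetical names, and it assumes an action fires once heap usage divided by `max_heap_size_bytes` reaches its threshold.

```go
package main

import "fmt"

// action pairs an overload action name with the fraction of the
// configured maximum heap at which it is assumed to fire.
// Hypothetical type for illustration only.
type action struct {
	name      string
	threshold float64
}

// sampleActions mirrors the two thresholds from the sample config.
var sampleActions = []action{
	{"envoy.overload_actions.shrink_heap", 0.95},
	{"envoy.overload_actions.stop_accepting_requests", 0.98},
}

// triggered returns the names of all actions whose threshold is met
// for the given heap usage. A zero maximum means "not configured",
// so nothing is triggered.
func triggered(heapBytes, maxHeapBytes uint64, actions []action) []string {
	if maxHeapBytes == 0 {
		return nil
	}
	pressure := float64(heapBytes) / float64(maxHeapBytes)
	var out []string
	for _, a := range actions {
		if pressure >= a.threshold {
			out = append(out, a.name)
		}
	}
	return out
}

func main() {
	const maxHeap uint64 = 2147483648 // 2 GiB, as in the sample
	// Roughly 50%, 96% and 99% of the configured maximum.
	for _, pct := range []uint64{50, 96, 99} {
		used := maxHeap / 100 * pct
		fmt.Println(pct, triggered(used, maxHeap, sampleActions))
	}
}
```

With the sample values, heap shrinking would start at about 95% of 2 GiB, and new requests would be refused from about 98%.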
@youngnick I've moved this to the backlog. I would prefer not to be doing any code changes apart from fixing existing bugs in the rc.2 - 1.0 time frame.
no problem, I was going to ask you where you wanted it.
Hello Everyone
This could be a very clever idea, as it could prevent Envoy from being OOMKilled in installations where memory limits are configured.
There could be a serve flag:
- If it is not given: overload protection is not configured
- If it is given: overload protection is configured with the heap size given in the parameter
For example: --max-heap=512 would render: max_heap_size_bytes: 536870912 (512 x 1024 x 1024, i.e. the flag value is taken as MiB)
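The flag behaviour described above could be sketched roughly as follows. This is only an illustration: it uses Go's standard `flag` package rather than whatever flag library Contour actually uses, `maxHeapBytes` is a hypothetical helper, and treating `0` as "flag not given" is an assumption.

```go
package main

import (
	"flag"
	"fmt"
)

// bytesPerMiB is the multiplier from the example: 1024 x 1024.
const bytesPerMiB = 1024 * 1024

// maxHeapBytes converts a heap limit given in MiB to bytes.
// Hypothetical helper for illustration.
func maxHeapBytes(maxHeapMiB uint64) uint64 {
	return maxHeapMiB * bytesPerMiB
}

func main() {
	// --max-heap is the flag name proposed in this thread; the default
	// of 0 is assumed here to mean "overload protection not configured".
	maxHeap := flag.Uint64("max-heap", 0, "maximum Envoy heap size in MiB (0 disables overload protection)")
	flag.Parse()

	if *maxHeap == 0 {
		fmt.Println("overload protection not configured")
		return
	}
	fmt.Printf("max_heap_size_bytes: %d\n", maxHeapBytes(*maxHeap))
}
```

Keeping the unit fixed at MiB keeps the flag simple, at the cost of not being able to express limits such as `1Gi` the way Kubernetes resource quantities do.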
Alternatively the actions and thresholds could be user configurable, but that could be harder to achieve.
I think when I raised this initially, I thought of allowing the thresholds to be tuned as well. To be honest, this feels more like a config setting to me, with the max-heap, and thresholds configurable. I do like the idea though.
We're actively trying to cut the number of command-line flags in favor of either config file or ContourConfig fields, so I'm probably a -1 on adding another flag.
I think the overload manager config is part of the bootstrap, so this would need to be implemented in the bootstrap subcommand, where all input currently comes from command-line flags. Also, since the subcommand is executed in the Envoy pod, it does not currently have access to the config file, or to the Kubernetes API server to read ContourConfig.
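For discussion, a minimal sketch of what rendering the overload manager section from flag-derived values might look like. Everything here is hypothetical (`overloadParams`, `renderOverload`, and the template are not Contour code), and the field names simply mirror the sample earlier in this issue, so they may not match what newer Envoy versions expect.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// overloadTmpl mirrors the sample config from this issue, with the heap
// size and thresholds left as parameters.
const overloadTmpl = `overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
  - name: "envoy.resource_monitors.fixed_heap"
    config:
      max_heap_size_bytes: {{ .MaxHeapBytes }}
  actions:
  - name: "envoy.overload_actions.shrink_heap"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: {{ .ShrinkHeap }}
  - name: "envoy.overload_actions.stop_accepting_requests"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: {{ .StopAccepting }}
`

// overloadParams holds the values that would come from command-line
// flags. Hypothetical type for illustration.
type overloadParams struct {
	MaxHeapBytes  uint64
	ShrinkHeap    float64
	StopAccepting float64
}

// renderOverload returns the YAML fragment, or an empty string when no
// maximum heap size was given (overload protection not configured).
func renderOverload(p overloadParams) (string, error) {
	if p.MaxHeapBytes == 0 {
		return "", nil
	}
	t, err := template.New("overload").Parse(overloadTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, p); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := renderOverload(overloadParams{
		MaxHeapBytes:  512 * 1024 * 1024,
		ShrinkHeap:    0.95,
		StopAccepting: 0.98,
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Because the only inputs are plain values, this shape would work from flags alone, which matters given that the bootstrap subcommand cannot read the config file or ContourConfig.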