Document how to set TCP retransmission timeout

Open itayB opened this issue 2 years ago • 3 comments

Proposal

Elasticsearch recommends setting the TCP retransmission timeout as an important system configuration. Can we get this out of the box in the Elastic operator? Today, I had to add an initContainer to configure it manually:

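initContainers:
  - name: tcp-transmission-settings
    image: "busybox:1.36.0"
    imagePullPolicy: IfNotPresent
    # Lower the TCP retransmission limit before Elasticsearch starts
    command:
      - sysctl
      - -w
      - net.ipv4.tcp_retries2=5
    # Writing this sysctl requires a privileged container
    securityContext:
      privileged: true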
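
For context, here is a minimal sketch of where such an init container would sit in an ECK Elasticsearch manifest, under spec.nodeSets[].podTemplate.spec. The resource name, Elasticsearch version, node set name, and node count are illustrative assumptions and are not taken from this issue.

```yaml
# Sketch only: metadata.name, spec.version, and the node set name/count
# are hypothetical values chosen for illustration.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart              # hypothetical resource name
spec:
  version: 8.6.2                # assumed Elasticsearch version
  nodeSets:
    - name: default             # hypothetical node set
      count: 3
      podTemplate:
        spec:
          initContainers:
            # Same init container as in the snippet above
            - name: tcp-transmission-settings
              image: "busybox:1.36.0"
              imagePullPolicy: IfNotPresent
              command: ["sysctl", "-w", "net.ipv4.tcp_retries2=5"]
              securityContext:
                privileged: true  # needed to write the sysctl
```

Placing the init container in the pod template keeps the change an explicit, per-cluster choice rather than something the operator applies on its own.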

The issue originally starts here.

Environment

  • ECK version: 2.6.0

  • Kubernetes information: Cloud: EKS 1.22

itayB avatar Apr 18 '23 10:04 itayB

The main concern with doing this in the operator is that it requires a privileged container to set a node-level sysctl (not 💯 sure whether this particular one is namespaced, but at least net.ipv4.tcp_retries2 is not whitelisted). So we generally prefer that users make the explicit choice of changing things like this with a privileged init container (or a DaemonSet that does it for all nodes in the cluster).

pebrc avatar Apr 23 '23 16:04 pebrc

Thanks for your answer @pebrc! Note that a similar recommendation (a privileged container) is documented here. Don't you think it is worth at least recommending this in the documentation as well?

itayB avatar Apr 30 '23 13:04 itayB

Sure, we can add a section to the documentation mentioning this.

pebrc avatar May 02 '23 09:05 pebrc