kubernetes-elasticsearch-cluster
Kubernetes limitation!! init-containers are executed only when a pod is created
```yaml
pod.beta.kubernetes.io/init-containers: '[
  {
    "name": "sysctl",
    "image": "busybox",
    "imagePullPolicy": "IfNotPresent",
    "command": ["sysctl", "-w", "vm.max_map_count=262144"],
    "securityContext": {
      "privileged": true
    }
  }
]'
```
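On Kubernetes 1.6+, the beta annotation above can instead be written as a first-class `initContainers` field in the pod spec. A minimal sketch of the equivalent fragment, assuming the same busybox image:

```yaml
spec:
  initContainers:
  - name: sysctl
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true
```

The same caveat applies either way: the init container only runs when the pod (re)starts, so it does not persist the setting on the node itself.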
The problem with the current cluster setup is that init containers are executed only when a pod is created on a node. However, if I restart the Kubernetes nodes underneath, I get this error: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]. This means that either 1) init containers should always be executed when a pod restarts (is that possible?), or 2) the sysctl setting should be added permanently on the nodes.
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
If the Pod is restarted, all Init Containers must execute again.
Please consider re-opening due to - kubernetes/kubernetes#36485
Has anyone figured out a work around for this?
@schonfeld I have added it permanently to all Kubernetes nodes. That solves the problem, and now people can run their clusters.
@zetaab yes yes, we have too -- but when a node fails and restarts, the pod will enter an infinite crash loop backoff state... At least until you manually delete that pod, and the RC spins up a new one... Right?
Have you tried adding a DaemonSet that does the same thing as this init-container?
@pires that actually sounds like a great idea. Is that what other folks are doing?
I just created/added this DS to our cluster. It seems to do the trick. I'm using Google's startup-script container, since running the same thing in busybox would just go into a restart loop. I'll keep this thread posted if we encounter any issues:
```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    name: es-ds
  name: es-ds
spec:
  selector:
    matchLabels:
      name: es-ds
  template:
    metadata:
      labels:
        name: es-ds
    spec:
      containers:
      - name: es-ds
        image: gcr.io/google-containers/startup-script:v1
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        env:
        - name: STARTUP_SCRIPT
          value: |
            #! /bin/bash
            sysctl -w vm.max_map_count=262144
            echo "done"
```
@schonfeld well, if a node fails and the pod goes into an infinite loop, it sounds like you have not added the setting permanently.
If you restart a node and run sysctl vm.max_map_count, you should see the correct value. If you do not, the setting is not permanent. This must be done outside Kubernetes; it does not work with init containers. You can make the value permanent by editing /etc/sysctl.conf.
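For example, on a systemd-based node the setting can be persisted via a drop-in file under /etc/sysctl.d/ (the file name here is arbitrary; the exact conventions may vary by distribution). A sketch, to be run as root on each node:

```shell
# Persist the setting so it survives reboots; files in /etc/sysctl.d/
# are applied at boot on systemd-based distros (/etc/sysctl.conf also works).
echo "vm.max_map_count=262144" > /etc/sysctl.d/99-elasticsearch.conf

# Apply it immediately without rebooting:
sysctl -w vm.max_map_count=262144
```

After a reboot, `sysctl vm.max_map_count` should then report 262144.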
The DaemonSet should guarantee that this is set at the host level every time the node reboots. If you do that, you won't need the init containers anymore.
@schonfeld the pod spec is missing hostPID: true - which is critical for re-running when a node restarts.
see: https://github.com/kubernetes/contrib/blob/87cfb9b24f4491c5e5b04dc17b5ae2f3c3500f26/startup-script/manage-startup-script.sh
```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    k8s-app: sysctl-conf
  name: sysctl-conf
spec:
  template:
    metadata:
      labels:
        k8s-app: sysctl-conf
    spec:
      containers:
      - command:
        - sh
        - -c
        - sysctl -w vm.max_map_count=262144 && while true; do sleep 86400; done
        image: busybox:1.26.2
        name: sysctl-conf
        resources:
          limits:
            cpu: 10m
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi
        securityContext:
          privileged: true
      terminationGracePeriodSeconds: 1
```
nice finding @mindw !! And to close the loop, this works on K8s 1.6.x+
https://github.com/kubernetes/kubernetes/issues/44041
I have the config below to set up my cluster. Setting max_map_count fails on Kubernetes 1.11. However, it works fine after I downgrade to Kubernetes 1.9 with exactly the same config.
```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    name: es-ds
  name: es-ds
spec:
  selector:
    matchLabels:
      name: es-ds
  template:
    metadata:
      labels:
        name: es-ds
    spec:
      hostPID: true
      containers:
      - name: es-ds
        image: gcr.io/google-containers/startup-script:v1
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        env:
        - name: STARTUP_SCRIPT
          value: |
            #! /bin/bash
            sysctl -w vm.max_map_count=262144
            echo "done"
```
@jiashenC look at https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/
I think that will solve your 1.11 issue.