kubernetes-elasticsearch-cluster
Ensure cluster is in a green state before stopping a pod
The timeout is set to 8h before releasing the hook and forcing the ES node to shut down.
What happens if I delete the deployment?
Good question, I didn't test that. Anyway, I think you can force a delete to bypass the hooks.
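For reference (not tested against this chart): a force delete removes the pod without honoring the termination grace period, so the preStop hook is effectively skipped. The pod name below is just an example:

kubectl delete pod es-data-0 --grace-period=0 --force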
👍 .. any plans to merge this?
Works for me
@deimosfr what about actively deallocating shards off that node as part of the lifecycle hook (e.g. setting exclude._ip and waiting for the node to become empty)?
With validation webhooks, it may be possible but it's a far-fetched thing to do here. Maybe an operator feature request?
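For what it's worth, the suggestion above would roughly translate to a preStop script like the following. This is only a minimal sketch, not part of this chart: it assumes an unsecured cluster reachable on http://localhost:9200 and uses hostname -i to obtain the node's IP.

#!/bin/bash
set -uo pipefail
# Exclude this node's IP so Elasticsearch relocates its shards elsewhere.
NODE_IP=$(hostname -i)
curl -s -XPUT -H 'Content-Type: application/json' 'http://localhost:9200/_cluster/settings' -d "{
  \"transient\": { \"cluster.routing.allocation.exclude._ip\": \"${NODE_IP}\" }
}"
# Block until _cat/shards no longer reports any shard on this node.
while curl -s 'http://localhost:9200/_cat/shards' | grep -qF "${NODE_IP}"; do
  sleep 5
done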
Regarding the deallocation of shards in the preStop hook, does anyone have a working example? It would be a nice feature to have.
Could something like https://github.com/kayrus/elk-kubernetes/blob/master/docker/elasticsearch/pre-stop-hook.sh be used?
It is not working in my case; the data pods scaled from 3 to 1 without waiting for the status to be "green".
@psalaberria002 @zhujinhe There are multiple ways to achieve this.
1. Relocate all shards off a node before proceeding with the next one, using preStop and postStart lifecycle hooks.
Here's my - slightly modified - working example which I originally took from https://github.com/helm/charts/blob/5cc1fd6c37f834949cf67c89fe23cf654a9bef77/incubator/elasticsearch/templates/configmap.yaml#L118
It's modified because we use the X-Pack security features and therefore need encryption + authentication. You may drop the encryption (use http instead of https://localhost) and the authentication part -u ${SOME_USER}:${SOME_PASSWORD}.
Depending on the network performance and the amount of data in the cluster, this approach can take very long and noticeably degrades cluster performance, because all shards are relocated before each restart.
configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.data.name }}-cm
  labels:
    app: {{ .Values.data.name }}
data:
  pre-stop-hook.sh: |-
    #!/bin/bash
    set -uo pipefail
    echo "Prepare to migrate data of the node ${NODE_NAME}"
    echo "Move all data from node ${NODE_NAME}"
    # Exclude this node by name so Elasticsearch relocates its shards elsewhere.
    curl -k -u ${SOME_USER}:${SOME_PASSWORD} -s -XPUT -H 'Content-Type: application/json' 'https://localhost:9200/_cluster/settings' -d "{
      \"transient\" :{
        \"cluster.routing.allocation.exclude._name\" : \"${NODE_NAME}\"
      }
    }"
    echo ""
    # Wait until this node no longer holds any shard (ignoring the .security index).
    while true ; do
      echo -e "Wait for node ${NODE_NAME} to become empty"
      SHARDS_ALLOCATION=$(curl -k -u ${SOME_USER}:${SOME_PASSWORD} -s -XGET 'https://localhost:9200/_cat/shards')
      if ! echo "${SHARDS_ALLOCATION}" | grep -E "${NODE_NAME}" | grep -v ".security-"; then
        echo -e "${NODE_NAME} has been evacuated"
        break
      fi
      sleep 1
    done
  post-start-hook.sh: |-
    #!/bin/bash
    set -uo pipefail
    # Wait until the local node responds again.
    while true; do
      curl -k -u ${SOME_USER}:${SOME_PASSWORD} -XGET "https://localhost:9200/_cluster/health"
      if [[ "$?" == "0" ]]; then
        break
      fi
      echo -e "${NODE_NAME} not reachable, retrying ..."
      sleep 1
    done
    echo ""
    # Remove the allocation exclusion again so the node can receive shards.
    CLUSTER_SETTINGS=$(curl -k -u ${SOME_USER}:${SOME_PASSWORD} -s -XGET "https://localhost:9200/_cluster/settings")
    if echo "${CLUSTER_SETTINGS}" | grep -E "${NODE_NAME}"; then
      echo -e "Activate node ${NODE_NAME}"
      curl -k -u elastic:${ES_BOOTSTRAP_PW} -s -XPUT -H 'Content-Type: application/json' "https://localhost:9200/_cluster/settings" -d "{
        \"transient\" :{
          \"cluster.routing.allocation.exclude._name\" : null
        }
      }"
    fi
    echo -e "Node ${NODE_NAME} is ready to be used"
deployment.yaml:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  namespace: {{ .Release.Namespace }}
  name: {{ .Values.data.name }}
  labels:
    app: {{ .Values.data.name }}
spec:
  serviceName: {{ .Values.data.name }}
  replicas: {{ .Values.data.deployment.replicas }}
  revisionHistoryLimit: {{ .Values.data.deployment.revisionHistoryLimit }}
  podManagementPolicy: {{ .Values.data.deployment.podManagementPolicy }}
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: {{ .Values.data.name }}
      annotations:
    spec:
      serviceAccount: {{ .Values.serviceAccount }}
      securityContext:
        runAsUser: {{ .Values.userId }}
        fsGroup: {{ .Values.groupId }}
      imagePullSecrets:
        - name: {{ .Values.data.deployment.imagePullSecretName }}
      initContainers:
        - name: {{ .Values.data.deployment.initContainers.increaseMapCount.name }}
          image: "{{ .Values.image.os.repository }}:{{ .Values.image.os.tag }}"
          imagePullPolicy: {{ .Values.image.os.pullPolicy }}
          command:
            - sh
            - -c
            - 'echo 262144 > /proc/sys/vm/max_map_count'
          securityContext:
            privileged: {{ .Values.data.deployment.initContainers.increaseMapCount.securityContext.privileged }}
            runAsUser: {{ .Values.data.deployment.initContainers.increaseMapCount.securityContext.runAsUser }}
      containers:
        - name: {{ .Values.data.shortName }}
          image: "{{ .Values.image.elasticsearch.repository }}:{{ .Values.image.elasticsearch.tag }}"
          imagePullPolicy: {{ .Values.image.elasticsearch.pullPolicy }}
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          readinessProbe:
            exec:
              command:
                - /bin/bash
                - -c
                - /usr/bin/curl -k -u ${USERNAME}:${PASSWORD} "https://localhost:9200/_cluster/health?local=true"
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          lifecycle:
            preStop:
              exec:
                command: ["/bin/bash","/pre-stop-hook.sh"]
            postStart:
              exec:
                command: ["/bin/bash","/post-start-hook.sh"]
          volumeMounts:
            - name: lifecycle-hooks
              mountPath: /pre-stop-hook.sh
              subPath: pre-stop-hook.sh
            - name: lifecycle-hooks
              mountPath: /post-start-hook.sh
              subPath: post-start-hook.sh
      terminationGracePeriodSeconds: 86400
      volumes:
        - name: lifecycle-hooks
          configMap:
            name: {{ .Values.data.name }}-cm
The deployment.yaml above is not the full file; it only shows the parts required for the lifecycle hooks.
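As a side note, you can verify during a rolling update that the hooks actually did their job by inspecting the cluster settings (same credentials and endpoint as in the scripts above):

curl -k -u ${SOME_USER}:${SOME_PASSWORD} -s "https://localhost:9200/_cluster/settings?pretty"

While a node is draining, transient.cluster.routing.allocation.exclude._name should contain that node's name; after the post-start hook has run it should be cleared again.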
2. Just ensure a container is only stopped while the cluster is in a green state, using a readiness probe or a preStop hook.
readinessProbe:
readinessProbe:
  exec:
    command:
      - /bin/bash
      - -c
      - /usr/bin/curl -k -u ${SOME_USER}:${SOME_PASSWORD} "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=30s" | grep -v '"timed_out":true'
preStop:
lifecycle:
  preStop:
    exec:
      command:
        - /bin/bash
        - -c
        - /usr/bin/curl -k -u ${SOME_USER}:${SOME_PASSWORD} "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=28800s"
It's important to also set terminationGracePeriodSeconds: 28800, otherwise the container will be killed after 30s, which is the default grace period.
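Just to make the placement clear: terminationGracePeriodSeconds is a pod-level field, so it sits next to containers in the pod spec rather than inside the lifecycle block. A trimmed sketch reusing the values from above:

spec:
  terminationGracePeriodSeconds: 28800
  containers:
    - name: {{ .Values.data.shortName }}
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/bash
              - -c
              - /usr/bin/curl -k -u ${SOME_USER}:${SOME_PASSWORD} "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=28800s"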