
Add startupProbe to statefulsets

Open jhatcher9999 opened this issue 4 years ago • 2 comments

James Hatcher (jhatcher9999) commented:

There is a potential issue in our statefulsets (STS) which is as follows:

  • Our STSs have an updateStrategy of RollingUpdate, which means that when you edit the pod template of the STS, it begins a rolling update of its pods. Examples of such edits include changing the image version, labels, or annotations.
  • When the STS cycles through the pods, the PodStatus moves from ContainerCreating to Initialized. Then the container starts and the pod is reported as Running; the livenessProbe begins checking it, and once the readinessProbe succeeds the pod is marked Ready.
  • As soon as this state is reached, the STS moves forward and starts to terminate and update the next pod in the STS.
  • At this point, from the CRDB cluster's standpoint, the node has rejoined, but the cluster may still be resolving under-replicated ranges, etc. From the CRDB perspective, it would be better to wait until these issues are resolved and the cluster is stable before taking the next pod down, especially when the cluster is under load.

To remedy this situation, I propose that we add a startupProbe to the STS. The startupProbe is available in k8s 1.16+. When defined, it holds off the livenessProbe and readinessProbe until it succeeds; only then do the other probes kick in. If it fails, the container is killed and is subject to the pod's restartPolicy.
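
To make the ordering concrete, here is a minimal sketch of how the three probes could sit side by side in a container spec. The endpoints, port name, and thresholds are illustrative assumptions, not values taken from our manifests. It also shows the option of letting the kubelet do the retrying through failureThreshold and periodSeconds, rather than looping inside the probe command.

```yaml
# Illustrative container-spec fragment; paths, port name, and thresholds are assumptions.
startupProbe:
  httpGet:
    path: /health?ready=1
    port: http
  periodSeconds: 10
  failureThreshold: 30   # kubelet retries for up to 30 x 10 s = 300 s before killing the container
livenessProbe:           # not run until the startupProbe has succeeded
  httpGet:
    path: /health
    port: http
  periodSeconds: 5
readinessProbe:          # not run until the startupProbe has succeeded
  httpGet:
    path: /health?ready=1
    port: http
  periodSeconds: 5
  failureThreshold: 2
```

Note that with this pattern the container is restarted if the retry budget runs out, which may or may not be desirable during a long rebalance.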

Here is a startupProbe that I tested successfully.

       startupProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - |
              for i in $(seq 1 30);
              do
                UR=$(/cockroach/cockroach sql \
                   --certs-dir=/cockroach/cockroach-certs/ \
                   -e "SELECT SUM((metrics->>'ranges.underreplicated')::DECIMAL)::INT8 AS ranges_underreplicated FROM crdb_internal.kv_store_status S INNER JOIN crdb_internal.gossip_liveness L ON S.node_id = L.node_id WHERE L.decommissioning <> true;" \
                   --format raw \
                   --host=cockroachdb-public | awk '{if(NR>3)print}' | awk '{if(NR==1)print}'
                     );
                echo "Under-replicated ranges: $UR" >> /usr/share/message;
                if [ -z "$UR" ];
                then
                  echo "No under-replicated range count returned.  Sleeping for 10 seconds - iteration $i" >> /usr/share/message;
                  sleep 10;
                  continue;
                fi
                if [ "$UR" -gt 0 ];
                then
                  echo "Sleeping for 10 seconds - iteration $i" >> /usr/share/message;
                  sleep 10;
                else
                  echo "breaking out of loop" >> /usr/share/message;
                  break;
                fi
              done
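              # Always exit 0: the loop only delays the probe's success (up to ~300 s); the script never reports failure.
              exit 0;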
          failureThreshold: 1
          periodSeconds: 10
          successThreshold: 1
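          # Note: Kubernetes 1.20+ enforces timeoutSeconds for exec probes, and the loop above can run for ~300 s,
          # so this value would need to be raised on those versions.
          timeoutSeconds: 1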
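
As a side note on the parsing step, the two chained awk calls in the probe skip the first three lines of output and then keep the first remaining line, which amounts to selecting line 4. A possible simplification is sketched below. The helper name `fourth_line` and the sample input are made up for illustration; the real output of the SQL client is not reproduced here.

```shell
# Hypothetical helper: print only the 4th line of whatever arrives on stdin.
# This selects the same line as: awk '{if(NR>3)print}' | awk '{if(NR==1)print}'
fourth_line() {
  awk 'NR == 4 { print; exit }'
}

# Made-up stand-in for the SQL client output, for illustration only.
sample_output() {
  printf '%s\n' 'header-1' 'header-2' 'header-3' '0' 'footer'
}

UR=$(sample_output | fourth_line)
echo "Under-replicated ranges: $UR"
```

If the output format of the SQL client ever changes, a single selector like this is also a smaller surface to adjust.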

Here is a script that can be used to monitor pods in the STS as they are cycled:

echo "Node 1" && kubectl exec cockroachdb-1 -it -n <put NS here> --context <put context here> -- cat /usr/share/message && echo; \
echo "Node 2" && kubectl exec cockroachdb-2 -it -n <put NS here> --context <put context here> -- cat /usr/share/message && echo
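
For clusters with more than two pods, the same check could be wrapped in a small loop. This is only a sketch: the function name `show_probe_logs` is hypothetical, and `NS` and `CTX` are assumed to hold the namespace and kube context that the placeholders above stand for. The `-it` flags are dropped because `cat` needs no interactive terminal.

```shell
# Hypothetical helper: print the probe log from each pod ordinal passed as an argument.
# NS and CTX are assumed to be set to the namespace and kube context.
show_probe_logs() {
  for ordinal in "$@"; do
    echo "Node $ordinal"
    kubectl exec "cockroachdb-$ordinal" -n "$NS" --context "$CTX" -- cat /usr/share/message
    echo
  done
}

# Example: show_probe_logs 0 1 2
```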

A few things I haven't totally worked through that need further consideration and testing:

  1. How well does it handle long-running startups (for instance, downloading an image for the first time)?
  2. How does it behave across various cluster configs (single-region, multi-region, single-node)?
  3. How does it react when running on k8s versions prior to 1.16?
  4. Besides under-replicated ranges, are there other scenarios that the probe should consider? Resources available? Running jobs? Storage capacity? Gossip established with x% of the nodes in the cluster?

Jira Issue: DOC-1071

jhatcher9999 • Mar 29 '21 15:03

I tested the script on EKS 1.17 and found that EKS doesn't support alpha features in 1.17. The startupProbe was an alpha feature in 1.16 and became a beta feature in 1.18: https://github.com/aws/containers-roadmap/issues/947

lin-crl • Mar 29 '21 20:03

linville (mdlinville) commented: Is the request here to update the actual StatefulSet? If so, this may need more than a DOC update.

exalate-issue-sync[bot] • Sep 25 '23 23:09