kube-state-metrics
Some kube-state-metrics shards are serving up stale metrics
What happened:
We found some kube-state-metrics shards are serving up stale metrics.
For example, this pod is running and healthy:
$ kubectl get pods provider-kubernetes-a3cbbe355fa7-6d9d468f59-xbfsq
NAME READY STATUS RESTARTS AGE
provider-kubernetes-a3cbbe355fa7-6d9d468f59-xbfsq 1/1 Running 0 87m
However, for the past hour, kube_pod_container_status_waiting_reason has been reporting it as ContainerCreating.
And to prove this is being served by KSM, we looked at the incriminating shard's (kube-state-metrics-5) /metrics endpoint and saw this metric is definitely stale:
kube_pod_container_status_waiting_reason{namespace="<redacted>",pod="provider-kubernetes-a3cbbe355fa7-678fd88bc5-76dw4",uid="<redacted>",container="package-runtime",reason="ContainerCreating"} 1
This is just one example; there appear to be several similar cases.
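For reference, a shard's /metrics output can be cross-checked against the API server along these lines (a rough sketch; port 8080 and the shard/pod names are the ones from this report, and the pod's own namespace, redacted above, would need to be supplied):
$ # what the shard is serving
$ kubectl -n kube-state-metrics port-forward pod/kube-state-metrics-5 8080:8080 &
$ curl -s localhost:8080/metrics | grep kube_pod_container_status_waiting_reason | grep provider-kubernetes
$ # what the API server says the container is actually doing
$ kubectl get pod provider-kubernetes-a3cbbe355fa7-6d9d468f59-xbfsq -o jsonpath='{.status.containerStatuses[*].state}'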
What you expected to happen:
We expect the metric(s) to match the actual state of the cluster.
How to reproduce it (as minimally and precisely as possible):
Unfortunately, we're not quite sure when or why it gets into this state (anecdotally, it almost always happens when we upgrade KSM, though today there was no update besides some Prometheus agents).
We can mitigate the issue by restarting all the KSM shards, e.g.:
$ kubectl rollout restart -n kube-state-metrics statefulset kube-state-metrics
... if that's any clue for determining the root cause.
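If only a single shard is affected, deleting just that pod (the StatefulSet controller recreates it) may be a lighter-weight mitigation; this is a sketch of the idea rather than something verified here:
$ kubectl -n kube-state-metrics delete pod kube-state-metrics-5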
Anything else we need to know?:
- When I originally ran into the problem, I thought it had something to do with the Compatibility Matrix. But as of KSM v2.11.0, I confirmed the client libraries are updated for my version of k8s (v1.28).
- There's nothing out of the ordinary in the KSM logs (a sketch for inspecting a shard's runtime sharding settings follows this list):
kube-state-metrics-5 logs:
I0409 08:17:51.349017 1 wrapper.go:120] "Starting kube-state-metrics"
W0409 08:17:51.349231 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0409 08:17:51.350019 1 server.go:199] "Used resources" resources=["limitranges","storageclasses","deployments","resourcequotas","statefulsets","cronjobs","endpoints","ingresses","namespaces","nodes","poddisruptionbudgets","mutatingwebhookconfigurations","replicasets","horizontalpodautoscalers","networkpolicies","validatingwebhookconfigurations","volumeattachments","daemonsets","jobs","services","certificatesigningrequests","configmaps","persistentvolumeclaims","replicationcontrollers","secrets","persistentvolumes","pods"]
I0409 08:17:51.350206 1 types.go:227] "Using all namespaces"
I0409 08:17:51.350225 1 types.go:145] "Using node type is nil"
I0409 08:17:51.350241 1 server.go:226] "Metric allow-denylisting" allowDenyStatus="Excluding the following lists that were on denylist: "
W0409 08:17:51.350258 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0409 08:17:51.350658 1 utils.go:70] "Tested communication with server"
I0409 08:17:52.420690 1 utils.go:75] "Run with Kubernetes cluster version" major="1" minor="28+" gitVersion="v1.28.6-eks-508b6b3" gitTreeState="clean" gitCommit="25a726351cee8ee6facce01af4214605e089d5da" platform="linux/amd64"
I0409 08:17:52.420837 1 utils.go:76] "Communication with server successful"
I0409 08:17:52.422588 1 server.go:350] "Started metrics server" metricsServerAddress="[::]:8080"
I0409 08:17:52.422595 1 server.go:339] "Started kube-state-metrics self metrics server" telemetryAddress="[::]:8081"
I0409 08:17:52.423030 1 server.go:73] level=info msg="Listening on" address=[::]:8080
I0409 08:17:52.423052 1 server.go:73] level=info msg="TLS is disabled." http2=false address=[::]:8080
I0409 08:17:52.423075 1 server.go:73] level=info msg="Listening on" address=[::]:8081
I0409 08:17:52.423093 1 server.go:73] level=info msg="TLS is disabled." http2=false address=[::]:8081
I0409 08:17:55.422262 1 config.go:84] "Using custom resource plural" resource="autoscaling.k8s.io_v1_VerticalPodAutoscaler" plural="verticalpodautoscalers"
I0409 08:17:55.422479 1 discovery.go:274] "discovery finished, cache updated"
I0409 08:17:55.422544 1 metrics_handler.go:106] "Autosharding enabled with pod" pod="kube-state-metrics/kube-state-metrics-5"
I0409 08:17:55.422573 1 metrics_handler.go:107] "Auto detecting sharding settings"
I0409 08:17:55.430380 1 metrics_handler.go:82] "Configuring sharding of this instance to be shard index (zero-indexed) out of total shards" shard=5 totalShards=16
I0409 08:17:55.431104 1 custom_resource_metrics.go:79] "Custom resource state added metrics" familyNames=["kube_customresource_vpa_containerrecommendations_target","kube_customresource_vpa_containerrecommendations_target"]
I0409 08:17:55.431143 1 builder.go:282] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments,autoscaling.k8s.io/v1, Resource=verticalpodautoscalers"
I0416 16:47:01.423216 1 config.go:84] "Using custom resource plural" resource="autoscaling.k8s.io_v1_VerticalPodAutoscaler" plural="verticalpodautoscalers"
I0416 16:47:01.423283 1 config.go:209] "reloaded factory" GVR="autoscaling.k8s.io/v1, Resource=verticalpodautoscalers"
I0416 16:47:01.423466 1 builder.go:208] "Updating store" GVR="autoscaling.k8s.io/v1, Resource=verticalpodautoscalers"
I0416 16:47:01.423499 1 discovery.go:274] "discovery finished, cache updated"
I0416 16:47:01.423527 1 metrics_handler.go:106] "Autosharding enabled with pod" pod="kube-state-metrics/kube-state-metrics-5"
I0416 16:47:01.423545 1 metrics_handler.go:107] "Auto detecting sharding settings"
- This may be related to https://github.com/kubernetes/kube-state-metrics/issues/2355, but I don't understand the linked PR well enough to say conclusively.
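As referenced above, a shard's runtime sharding settings can also be spot-checked via its self-metrics (telemetry) endpoint; a sketch, assuming the default telemetry port 8081 shown in the logs:
$ kubectl -n kube-state-metrics port-forward pod/kube-state-metrics-5 8081:8081 &
$ curl -s localhost:8081/metrics | grep -i shard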
Environment:
- kube-state-metrics version: v2.12.0 (this has occurred in previous versions too)
- Kubernetes version (use kubectl version): v1.28.6
- Cloud provider or hardware configuration: EKS
- Other info:
qq: have your Statefulset labels been changed?
> have your Statefulset labels been changed?
For this particular case, we don't suspect they'd changed (though we drop that metric, so we can't confirm it 100%).
But in other cases where we run into this issue, the labels almost always change, particularly the chart version when we upgrade:
Labels: app.kubernetes.io/component=metrics
app.kubernetes.io/instance=kube-state-metrics
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=kube-state-metrics
app.kubernetes.io/part-of=kube-state-metrics
app.kubernetes.io/version=2.12.0
helm.sh/chart=kube-state-metrics-5.18.1
release=kube-state-metrics
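As a quick sanity check on whether the StatefulSet (and its pod template) labels changed across an upgrade, something along these lines works; a sketch, assuming the namespace and object names used in this report:
$ kubectl -n kube-state-metrics get statefulset kube-state-metrics --show-labels
$ kubectl -n kube-state-metrics get controllerrevisions -l app.kubernetes.io/name=kube-state-metrics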
/assign @CatherineF-dev /triage accepted
> But in other cases where we run into this issue, the labels almost always change, particularly the chart version when we upgrade:
This is related to https://github.com/kubernetes/kube-state-metrics/pull/2347
> For this particular case, we don't suspect they'd changed (though we drop that metric, so we can't confirm it 100%).
This is a new issue.
> This is related to https://github.com/kubernetes/kube-state-metrics/pull/2347
For the purposes of this issue, I think it's wholly related to https://github.com/kubernetes/kube-state-metrics/pull/2347 (the one time we claimed the statefulset may not have changed labels, we had no proof of that).
IMO, we can track this issue against that PR for closure (and if we see another case of stale metrics, we can gather the exact circumstances in a separate issue if needed).
Looks like this will be resolved in v2.13.0
For tracking purposes: this problem still persists (even in the latest version).
This may lend credence to the one time we ran into this issue and claimed there was no label change. I believe https://github.com/kubernetes/kube-state-metrics/issues/2431 is reporting the same issue.
The labels/versions for reference:
Labels: app=kube-state-metrics
app.kubernetes.io/component=metrics
app.kubernetes.io/instance=kube-state-metrics
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=kube-state-metrics
app.kubernetes.io/part-of=kube-state-metrics
app.kubernetes.io/version=2.13.0
helm.sh/chart=kube-state-metrics-5.24.0
release=kube-state-metrics
Annotations: kubectl.kubernetes.io/restartedAt: 2024-07-26TXX:XX:XX-XX:XX # <--- the workaround
our-workaround-rev-to-trigger-shard-refresh: <someSHA>
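For completeness, the annotation workaround boils down to bumping a pod-template annotation so the StatefulSet rolls all of its pods; a sketch using the illustrative annotation key from above:
$ kubectl -n kube-state-metrics patch statefulset kube-state-metrics --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"our-workaround-rev-to-trigger-shard-refresh":"<someSHA>"}}}}}'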
@schahal could you reproduce this issue consistently? If so, could you help provide detailed steps to reproduce it? You can anonymize pod name.
> could you reproduce this issue consistently? If so, could you help provide detailed steps to reproduce it?
Aside from what's in the description, I feel like this consistently happens any time the StatefulSet is updated, e.g.:
- I change our kube-state-metrics Helm chart version from version: 5.25.0 to version: 5.25.1
- I then let our continuous delivery pipeline (in this case ArgoCD) render the templates (e.g., helm template ...)
- Those rendered templates are then applied to the Kubernetes cluster, updating the StatefulSet
Invariably, right after that we get shards with stale metrics, mitigated only by restarting the pods.
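A rough sketch of that upgrade flow (the chart repository, release name, and versions here are illustrative assumptions, not the exact pipeline):
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm template kube-state-metrics prometheus-community/kube-state-metrics --version 5.25.1 --namespace kube-state-metrics > rendered.yaml
$ kubectl apply -n kube-state-metrics -f rendered.yaml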
https://github.com/kubernetes/kube-state-metrics/issues/2431 and this Slack thread have other perspectives from different users on the same symptom, which may shed some more light.