Crash when deleting CRD
What happened: kube-state-metrics crashed after a CRD was deleted
What you expected to happen: kube-state-metrics to handle the deletion gracefully instead of crashing
How to reproduce it (as minimally and precisely as possible):
Delete a CRD such that the informer's delete handler receives a cache.DeletedFinalStateUnknown (possibly between discovery intervals?)
Anything else we need to know?: Here's a link to the client-go doc for this type: https://pkg.go.dev/k8s.io/client-go/tools/cache#DeletedFinalStateUnknown
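For context, client-go hands the delete handler a cache.DeletedFinalStateUnknown tombstone when the watch missed the actual delete event, so the handler can't assert *unstructured.Unstructured unconditionally. Here's a minimal sketch of the usual unwrapping pattern (the handler and object names are illustrative, not the actual kube-state-metrics code):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/tools/cache"
)

// onDelete unwraps a possible tombstone before asserting the concrete type,
// instead of asserting *unstructured.Unstructured directly.
func onDelete(obj interface{}) {
	u, ok := obj.(*unstructured.Unstructured)
	if !ok {
		// The informer missed the delete event and only has the last known
		// state of the object, wrapped in a tombstone.
		tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
		if !ok {
			fmt.Printf("unexpected object in delete handler: %T\n", obj)
			return
		}
		u, ok = tombstone.Obj.(*unstructured.Unstructured)
		if !ok {
			fmt.Printf("tombstone contained unexpected object: %T\n", tombstone.Obj)
			return
		}
	}
	fmt.Printf("handling deletion of %s\n", u.GetName())
}

func main() {
	crd := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "apiextensions.k8s.io/v1",
		"kind":       "CustomResourceDefinition",
		"metadata":   map[string]interface{}{"name": "examples.mygroup.example"},
	}}
	// Simulate the case from this issue: the handler receives a tombstone,
	// not the object itself.
	onDelete(cache.DeletedFinalStateUnknown{Key: "examples.mygroup.example", Obj: crd})
}
```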
Environment:
- kube-state-metrics version: 2.10.0
- Kubernetes version (use kubectl version): v1.27.4
- Cloud provider or hardware configuration: GKE
- Other info:
Error log:
E0922 08:07:20.392009 1 runtime.go:79] Observed a panic: &runtime.TypeAssertionError{_interface:(*runtime._type)(0x188e6c0), concrete:(*runtime._type)(0x19602a0), asserted:(*runtime._type)(0x1af2fe0), missingMethod:""} (interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *unstructured.Unstructured)
goroutine 25 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x18ef280?, 0xc0018dc000})
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x0?})
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0x75
panic({0x18ef280, 0xc0018dc000})
/usr/local/go-1.20.7/src/runtime/panic.go:884 +0x213
k8s.io/kube-state-metrics/v2/internal/discovery.(*CRDiscoverer).StartDiscovery.func2({0x19602a0?, 0xc001cc0700?})
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/src/k8s.io/kube-state-metrics/internal/discovery/discovery.go:78 +0x495
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnDelete(...)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:257
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:978 +0xaf
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000230738?, {0x1d7cbc0, 0xc000732000}, 0x1, 0xc000730000)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x3b9aca00, 0x0, 0x0?, 0x0?)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:161
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0000df4d0)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:967 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:70 +0x85
panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *unstructured.Unstructured [recovered]
panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *unstructured.Unstructured
goroutine 25 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x0?})
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x18ef280, 0xc0018dc000})
/usr/local/go-1.20.7/src/runtime/panic.go:884 +0x213
k8s.io/kube-state-metrics/v2/internal/discovery.(*CRDiscoverer).StartDiscovery.func2({0x19602a0?, 0xc001cc0700?})
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/src/k8s.io/kube-state-metrics/internal/discovery/discovery.go:78 +0x495
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnDelete(...)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:257
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:978 +0xaf
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000230738?, {0x1d7cbc0, 0xc000732000}, 0x1, 0xc000730000)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x3b9aca00, 0x0, 0x0?, 0x0?)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:161
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0000df4d0)
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:967 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
/bitnami/blacksmith-sandox/kube-state-metrics-2.10.0/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:70 +0x85
QQ: have you enabled the custom resource state metrics feature?
Ah, we have, but with the following settings, as nothing has been templated for it here:
kind: CustomResourceStateMetrics
spec: {}
Also, apologies, I didn't check the logs far enough back before; alongside the panic, it also regularly logs the following:
E0927 13:47:06.709384 1 reflector.go:148] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:231: Failed to watch apiextensions.k8s.io/v1, Resource=customresourcedefinitions: unknown
Which I'm assuming is related. So I guess this is either invalid config, or an empty spec could be treated as a no-op? (See the sketch below for what I mean by a no-op.)
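To illustrate the no-op interpretation, something like this rough sketch is what I had in mind; the type and function names are hypothetical, not KSM's actual config code:

```go
package main

import "fmt"

// customResourceStateConfig is a stand-in for the parsed
// --custom-resource-state-config contents; it is not the real
// kube-state-metrics type.
type customResourceStateConfig struct {
	Kind string
	Spec map[string]interface{}
}

// shouldStartDiscovery treats a missing or empty spec as "feature not
// configured" instead of starting CRD discovery anyway.
func shouldStartDiscovery(cfg *customResourceStateConfig) bool {
	// spec: {} parses to an empty map, so this returns false for the
	// configuration shown above.
	return cfg != nil && len(cfg.Spec) > 0
}

func main() {
	empty := &customResourceStateConfig{
		Kind: "CustomResourceStateMetrics",
		Spec: map[string]interface{}{},
	}
	fmt.Println(shouldStartDiscovery(empty)) // false
}
```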
Since you are not actually using the CustomResourceStateMetrics feature (spec: {}), could you disable it and try again?
Yes, I've updated our templating to omit the --custom-resource-state-config flag when the spec is empty.
I can confirm this resolves the log messages, and KSM is once again unaffected by CRD changes.
Okay. I guess the crash might be related to https://github.com/kubernetes/kube-state-metrics/issues/2202
I discussed with @logicalhan yesterday moving the CustomResourceStateMetrics feature out of the KSM repo. This issue is one data point that supports that idea.
/triage accepted