config-syncer
Kubed pod crashes with error - "unable to parse requirement: invalid label value"
While trying out which values the kubed annotation accepts, I mistakenly used kubed.appscode.com/sync="app=*-api". I observed the kubed pod crashing with the error:
unable to parse requirement: invalid label value: "*-api": at key: "name": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')
at k8s.io/apimachinery/pkg/util/runtime.HandleCrash (runtime.go:55)
at panic (/usr/local/go/src/runtime/panic.go:969)
at k8s.io/apimachinery/pkg/util/runtime.Must (runtime.go:171)
at github.com/appscode/kubed/pkg/syncer.(*nsSyncer).OnAdd (resourcehandlers.go:152)
at k8s.io/client-go/tools/cache.(*processorListener).run.func1.1 (shared_informer.go:658)
at k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff (wait.go:292)
at k8s.io/client-go/tools/cache.(*processorListener).run.func1 (shared_informer.go:652)
at k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1 (wait.go:152)
at k8s.io/apimachinery/pkg/util/wait.JitterUntil (wait.go:153)
at k8s.io/apimachinery/pkg/util/wait.Until (wait.go:88)
at k8s.io/client-go/tools/cache.(*processorListener).run (shared_informer.go:650)
at k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1 (wait.go:71)
The kubed pod refused to start until that wrong annotation was removed.
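For reference, the invalid-label-value error comes from parsing the annotation value as a label selector. Here is a minimal standalone sketch (not kubed's actual code path) that reproduces the parse failure with labels.Parse from k8s.io/apimachinery; the panic in the trace above appears to come from handing that same error to runtime.Must:

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// The selector string that was put in the annotation.
	sel := "app=*-api"

	// labels.Parse reports an invalid label value as an ordinary error,
	// so it can be handled without crashing the process.
	if _, err := labels.Parse(sel); err != nil {
		fmt.Printf("selector %q rejected: %v\n", sel, err)
		return
	}
	fmt.Println("selector parsed OK")
}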
I'll bump this issue and document an additional failure case.
We define resources in source control using Kustomize, and these resources have some dynamic properties that are replaced using Kustomize vars.
Due to a script bug, a secret was deployed with an un-replaced var in the kubed.appscode.com/sync annotation. The resulting resource, as actually deployed to the cluster, looked something like this:
apiVersion: v1
kind: Secret
metadata:
  annotations:
    # OOPS! should have been kubernetes.io/metadata.name=build-9001
    kubed.appscode.com/sync: kubernetes.io/metadata.name=build-$(BUILD_NUMBER)
data:
  ...
This then caused the kubed pod to fail in the same way as above and go into CrashLoopBackOff, effectively torpedoing Kubed for the entire cluster.
My perspective on the remediation of this bug:
- While I could observe panics in the kubed pod logs, at no point was the actual secret name printed. I had to scrape annotations across all secrets to even find the offending resource, which increased downtime. At the very least, the name of the offending resource should be included in the error message.
- This annotation was completely legal, as evidenced by the fact that the resource was admitted. Kubed should be able to handle all possible annotation values, and ignore/warn on any that don't parse.
- Probably the same point as above, but Kubed should implement a best-effort approach to config syncing. One bad resource annotation should not cause a cluster-wide cascading failure (see the sketch after this list for the kind of handling I mean).
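To make that concrete, here is a minimal sketch of a tolerant handler. The function name onSecretAdd, the constant syncAnnotation, and the handler shape are my assumptions for illustration, not kubed's actual internals; the point is only that a bad annotation gets logged with the resource's namespace/name and skipped, instead of panicking:

package syncer

import (
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// syncAnnotation is the annotation discussed above (name assumed for this sketch).
const syncAnnotation = "kubed.appscode.com/sync"

// onSecretAdd parses the sync annotation and skips the resource on error
// instead of panicking, so one bad annotation cannot take down the whole pod.
func onSecretAdd(obj interface{}) {
	secret, ok := obj.(*corev1.Secret)
	if !ok {
		return
	}
	raw, found := secret.Annotations[syncAnnotation]
	if !found {
		return
	}
	sel, err := labels.Parse(raw)
	if err != nil {
		// Name the offending resource so operators can find it immediately.
		log.Printf("ignoring %s on secret %s/%s: %v", syncAnnotation, secret.Namespace, secret.Name, err)
		return
	}
	// ... continue syncing the secret to namespaces matching sel ...
	_ = sel
}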
Hopefully this was helpful, thank you!