azure-key-vault-to-kubernetes
[BUG] Regression in Env Injector breaks postgres deployments managed by CloudNativePg
Components and versions
[X] Env-Injector (webhook), version: 1.5.0
[X] Helm Release (2.5.0)
Describe the bug The latest release (1.5.0) of the env injector causes a problem when running Postgres clusters with the CloudNative Postgres Operator. These Postgres clusters work fine with the previous release (1.4.0), which was installed with Helm release 2.4.2.
To Reproduce
- Install the CloudNative Postgres Operator using Helm chart version 0.18.2 with the default values.
- Set up a namespace "postgres-test":
apiVersion: v1
kind: Namespace
metadata:
  name: postgres-test
  labels:
    azure-key-vault-env-injection: enabled
- Then set up a Postgres cluster in the postgres-test namespace:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:13.12-6
  instances: 1
  storage:
    size: 1Gi
- The CloudNative Operator will create a pod (postgres-1-initdb-......) which fails to start with the following error message:
Error: container has runAsNonRoot and image has non-numeric user (nonroot), cannot verify user is non-root (pod: "postgres-1-initdb-5pq6x_bynubian-dev-02(edf17686-61ad-4165-b280-2a19c9400eda)", container: bootstrap-controller)
Expected behavior The pods started by the CloudNative Operator should start.
Additional context Reverting the akv2k8s Helm chart to 2.4.2, which also reverts the env injector to 1.4.0, fixes the issue and Postgres deployments are possible again.
We were having exactly the same problem, and it took a lot of time to figure out why the security context of the pods was empty even though it was configured in the Deployment and ReplicaSet 😅 Downgrading fixed the issue for now, but I hope it can be addressed in a future release.
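For anyone puzzled by the same symptom, here is a minimal standalone sketch of why the fields disappear (illustration only, using k8s.io/api/core/v1 types and a made-up UID; this is not the actual webhook code): assigning a brand-new PodSecurityContext replaces everything the operator had set, leaving only RunAsNonRoot.

// Illustration only: how replacing the whole PodSecurityContext drops
// operator-provided fields. Types are from k8s.io/api/core/v1; the UID is
// hypothetical.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Roughly what an operator such as CloudNativePG renders: an explicit
	// numeric UID on the pod-level security context.
	uid := int64(26)
	podSpec := corev1.PodSpec{
		SecurityContext: &corev1.PodSecurityContext{
			RunAsUser: &uid,
		},
	}

	// What the 1.5.0 webhook effectively does: build a fresh struct that
	// only carries RunAsNonRoot, discarding the RunAsUser above.
	nonRoot := true
	podSpec.SecurityContext = &corev1.PodSecurityContext{
		RunAsNonRoot: &nonRoot,
	}

	// RunAsUser is now nil, so the kubelet has no numeric UID to verify
	// against runAsNonRoot, which matches the error in this issue.
	fmt.Println(podSpec.SecurityContext.RunAsUser) // <nil>
}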
Hey guys,
I think I found the issue: https://github.com/SparebankenVest/azure-key-vault-to-kubernetes/commit/1e464adf8ca1777e2037801728d8d002cace78f1
podSpec.SecurityContext = &corev1.PodSecurityContext{
	RunAsNonRoot: &[]bool{viper.GetBool("webhook_pod_spec_security_context_non_root")}[0],
}
I've updated that to be:
if viper.GetBool("webhook_pod_spec_security_context_non_root") {
	podSpec.SecurityContext.RunAsNonRoot = &[]bool{viper.GetBool("webhook_pod_spec_security_context_non_root")}[0]
}
This respects the original pod securityContext unless webhook_pod_spec_security_context_non_root is forced to true, in which case only that one value is overridden.
I've posted a message on the dev Slack to see if I can get a dev environment going for this, but if anyone is interested, here is the diff:
diff --git a/cmd/azure-keyvault-secrets-webhook/pod.go b/cmd/azure-keyvault-secrets-webhook/pod.go
index f94ef5b..cf313b6 100644
--- a/cmd/azure-keyvault-secrets-webhook/pod.go
+++ b/cmd/azure-keyvault-secrets-webhook/pod.go
@@ -278,8 +278,8 @@ func (p podWebHook) mutatePodSpec(ctx context.Context, pod *corev1.Pod) error {
 	var authServiceSecret *corev1.Secret
 	var err error
 	podSpec := &pod.Spec
-	podSpec.SecurityContext = &corev1.PodSecurityContext{
-		RunAsNonRoot: &[]bool{viper.GetBool("webhook_pod_spec_security_context_non_root")}[0],
+	if viper.GetBool("webhook_pod_spec_security_context_non_root") {
+		podSpec.SecurityContext.RunAsNonRoot = &[]bool{viper.GetBool("webhook_pod_spec_security_context_non_root")}[0]
 	}
 	if p.useAuthService {
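One caveat worth flagging, as a suggestion on top of the patch above rather than part of it: if an incoming pod carries no securityContext at all, podSpec.SecurityContext will be nil and the new assignment would panic. A slightly more defensive sketch of the same idea, as a drop-in replacement for those lines inside mutatePodSpec:

// Sketch only: same behaviour as the patch above, but guarding against a
// pod spec that arrives without any securityContext set.
if viper.GetBool("webhook_pod_spec_security_context_non_root") {
	if podSpec.SecurityContext == nil {
		podSpec.SecurityContext = &corev1.PodSecurityContext{}
	}
	nonRoot := true // the flag is known to be true inside this branch
	podSpec.SecurityContext.RunAsNonRoot = &nonRoot
}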
To make a new image:
make build images
This is broken for us as well and is keeping us from being able to move to workload identity. Could we get some attention on this?