secrets-store-csi-driver
Don't rely on pod creation to sync to Kubernetes secrets
Describe the solution you'd like
Currently, when you choose to sync to Kubernetes secrets, pulling secrets/certs/keys from Key Vault always happens at pod creation, and you also need to mount the SecretProviderClass volume in a specific pod to make it work.
If we have issues with Key Vault access (managed identity is not working, things are accidentally deleted in Key Vault, or the Key Vault API is unavailable), things start to fail as soon as a pod gets restarted or recreated. It's even worse for CronJobs: every new pod created under the CronJob tries to retrieve secrets/certs/keys from Key Vault, so while Key Vault is down you can no longer create ANY new pod in production.
I hope this library can simply sync secrets/certs/keys to Kubernetes secrets at deploy time, without relying on any pod to mount them, so I don't have to worry about Key Vault availability after deployment.
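For reference, here is a rough sketch of what the current flow requires (all names and the Azure parameters below are placeholders): the SecretProviderClass declares secretObjects for the sync, but the mirrored Kubernetes secret only appears once some pod actually mounts the CSI volume.

```yaml
# Sketch only: the Secret "app-secret" is created/synced only when a pod
# mounting this SecretProviderClass starts.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-kv-secrets
spec:
  provider: azure
  secretObjects:                      # mirror mounted content as a K8s Secret
    - secretName: app-secret
      type: Opaque
      data:
        - objectName: db-password
          key: db-password
  parameters:
    keyvaultName: my-keyvault         # placeholder
    tenantId: "<tenant-id>"           # placeholder
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: my-app:latest            # placeholder
      volumeMounts:
        - name: secrets-store
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: app-kv-secrets
```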
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
- Secrets Store CSI Driver version: (use the image tag):
- Kubernetes version: (use `kubectl version`):
@djj0809 Thank you for opening the issue! The scope of this project is to primarily mount the external secrets-store contents in the pod using the Kubernetes CSI standard. The sync as K8s secret is an optional feature to mirror those contents as Kubernetes secrets in addition to the mount. I think skipping the pod mount is a diversion from the actual scope of this driver implementation.
@aramase one potential application of such use case is configuring ingress resource. Any thoughts on that?
@tdihp That makes sense. Consumption by an ingress resource was one of the driving factors for optionally supporting mirroring the mounted content as Kubernetes Secrets. However, if we skip the mount completely and only create a Kubernetes Secret, that would be a diversion from the actual scope of the driver project. The CSI driver uses the CSI implementation as the core to mount secret contents for safe consumption within pods.
I think this is a reasonable suggestion given the fact that the main use case for the secrets-store-csi-driver is the integration with some kind of vault.
There are a lot of resources that rely on native Kubernetes secrets. Let's say I have 100+ resources depending on a native Kubernetes secret. Right now, that means I need to create a SecretProviderClass and then add relatively useless volumeMounts and volumes to those 100+ resources (useless in the sense that the resources will not use the mount itself, only the reference to the Kubernetes secret). If I do not add that huge amount of boilerplate for some resources, there is a risk that no resource has mounted the volume yet and the secret itself is unavailable.
If that feature were implemented, I would instead only need to create the SecretProviderClass and reference the Kubernetes secret in the resources that need it (see the sketch below). This would mean a lot less overhead and cleaner YAML.
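To illustrate with hypothetical pod spec excerpts: today each of those 100+ workloads needs the CSI volume boilerplate purely to trigger the sync, whereas with this feature a plain secret reference would be enough.

```yaml
# Today: CSI volume boilerplate added only to force the sync (sketch).
spec:
  containers:
    - name: app
      image: my-app:latest
      envFrom:
        - secretRef:
            name: app-secret          # what the workload actually uses
      volumeMounts:
        - name: secrets-store         # unused by the app itself
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: app-kv-secrets
---
# With the requested feature: just the secret reference (sketch).
spec:
  containers:
    - name: app
      image: my-app:latest
      envFrom:
        - secretRef:
            name: app-secret
```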
I think this could be a big improvement for all the vault integrations that depend on this project, unless there is already some way to get around this with the current setup that I missed.
I agree with jemag on this -- not being able to sync secrets from a vault into a Kubernetes secret without a pod mount effectively means managing secrets in two ways for most systems, because pretty much every Helm chart and 3rd-party component manages its pods via an Operator or other mechanism, and you can't control how they access secrets. They just expect them to be present.
That leaves administrators having to choose between not using a vault implementation, managing secrets two ways, or using a hacky process of periodically priming the system with a pod to get things synced as a secret. We settled on the latter for now, but it's definitely tricky to manage: the CSI configuration and that pod configuration have to be kept in sync, and if someone misses something, it's not always obvious (see the sketch below).
Given that the CSI mechanism wakes up when a SecretProviderClass is created or updated, and it knows which secrets are being requested to be synced, it seems like it's the best place to create a transient pod to trigger the synchronization.
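For anyone stuck with the same workaround, a sketch of the kind of "primer" deployment this ends up looking like (image and names are only examples): its sole purpose is to keep one pod mounting the SecretProviderClass so the mirrored secret exists and keeps rotating.

```yaml
# Sketch of a "primer" deployment whose only job is to mount the
# SecretProviderClass so the mirrored Kubernetes Secret gets created.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secret-primer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: secret-primer
  template:
    metadata:
      labels:
        app: secret-primer
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # any no-op image works
          volumeMounts:
            - name: secrets-store
              mountPath: /mnt/secrets
              readOnly: true
      volumes:
        - name: secrets-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: app-kv-secrets   # must be kept in sync
```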
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
+1 on this, will provide a detailed use case shortly.
Another use case is image pull secrets; there is a chicken and egg problem if creating the secret to pull the pod's image requires the pod to be running.
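To make the circularity concrete, a hypothetical sketch: the pull secret is expected to be mirrored by the driver, but the mirroring only happens once this pod mounts the volume, and the pod can't start because its image can't be pulled without that secret.

```yaml
# Sketch: registry-creds should come from the vault via the driver, but
# the sync only runs when this pod mounts the volume, and the image
# can't be pulled until registry-creds exists.
apiVersion: v1
kind: Pod
metadata:
  name: private-app
spec:
  imagePullSecrets:
    - name: registry-creds                     # expected to be synced
  containers:
    - name: app
      image: private.example.com/app:latest    # placeholder
      volumeMounts:
        - name: secrets-store
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: registry-creds-spc   # placeholder
```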
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
Another use case: we are trying to create a PV and PVC using the Azure flexvolume blobfuse driver, which needs a secret (type: azure/blobfuse) to be available in the cluster to access the storage account. This secret needs to be created by picking up the storage account access key from Key Vault, but there is no need to mount the secret in any pod.
Note: there is another issue in that the secrets-store-csi-driver does not support creating secrets of the "azure/blobfuse" type.
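To illustrate (treat the field names as a sketch based on the blobfuse flexvolume docs, not something verified here): the secret is consumed by the PersistentVolume definition itself, so no application pod ever mounts it.

```yaml
# Sketch: ideally this secret would be synced straight from Key Vault
# without any pod mount being involved.
apiVersion: v1
kind: Secret
metadata:
  name: storage-account-creds
type: azure/blobfuse
stringData:
  accountname: mystorageaccount        # placeholder
  accountkey: "<key from Key Vault>"   # placeholder
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: blobfuse-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  flexVolume:
    driver: azure/blobfuse
    secretRef:
      name: storage-account-creds      # referenced here, not in a pod spec
    options:
      container: mycontainer           # placeholder
```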
Another quick piece of feedback: I also find it very cumbersome that a pod has to be started in order for a secret to be created. I would even say it is counterintuitive when looking at this project for the first time, although I understand the initial scope of the driver was different.
As a user, I was hoping for a solution that easily bridges GCP Secret Manager and Kubernetes secrets without affecting my existing workload definitions.
/lifecycle stale
/remove-lifecycle stale
I opened a related issue to this on the Azure provider driver. We're not seeing the behavior described above where cronjob pods check for new values before each run; they're still using the cached values because the secrets still exist (the completed pods are not cleaned up yet).
But because the pods are completed/succeeded, the rotation interval does not update the values. In the end, this means jobs start failing because the secret values are wrong.
I think the lines linked below need to be changed to also rotate secrets for completed or failed job pods. It's possible that secrets are updated between runs while those pods still exist and new mounts will use the new values, and it's possible that the jobs/pods failed because of bad secret values.
Maybe changing this behavior would also address the initial ask in this issue. The secrets would still be created at pod creation rather than at deploy time, but they would be kept up to date for future runs. Then, if vault access fails, the most recent secret values should still be good to go.
https://github.com/kubernetes-sigs/secrets-store-csi-driver/blob/0f4ff7d43e591dbc7d3b87ed1d7602a5f15f1059/pkg/rotation/reconciler.go#L267-L270
I'm tagging @aramase because he made the initial commit to exclude those from rotation.
@djj0809: from the above, is it possible you're pruning your completed/failed pods? When those are removed, the secrets are removed, prompting a refresh on the next run. Check the successfulJobsHistoryLimit and failedJobsHistoryLimit values.
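For anyone checking this on their own CronJobs, these are the fields in question (values here are just an example): with low history limits the completed pods get pruned, the mirrored secret goes away with them, and the next run pulls fresh values.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "0 * * * *"
  successfulJobsHistoryLimit: 1    # how many completed jobs (and their pods) to keep
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: job
              image: my-job:latest   # placeholder
```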
/lifecycle stale
/remove-lifecycle stale
Any update regarding this issue?
We are also looking for a solution to this issue, especially agreeing with @jemag.
I think that for some of the use cases described in the comments, External Secrets Operator (https://external-secrets.io) could be a reasonable alternative. AFAIK it can simply sync an external secret to a native k8s secret.
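For reference, a minimal ExternalSecret sketch (store and key names are placeholders; based on the external-secrets.io v1beta1 API): it produces a native Kubernetes secret with no pod or volume mount involved.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-vault-store             # placeholder ClusterSecretStore
    kind: ClusterSecretStore
  target:
    name: app-secret                 # the native K8s Secret to create
  data:
    - secretKey: db-password
      remoteRef:
        key: db-password             # key in the external store
```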
Although External Secrets Operator can solve this particular problem, it would be nice if we could avoid having two controllers solving similar problems.
Another alternative, which I believe most of you have seen: https://akv2k8s.io/ (https://github.com/SparebankenVest/azure-key-vault-to-kubernetes)
There's another use case in Knative, and likely in other similar tools that do not support custom volume types: https://github.com/knative/serving/issues/11069. Syncing the secret only, and then referencing it 'natively', would enable any such use case.
Are there any potential issues with using a gcr.io/google_containers/pause pod to define the mount and then having other pods reference the secret?
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
Marking this as frozen so it doesn't go stale. We're exploring the option of decoupling the sync-as-Kubernetes-secret feature from the CSI driver mount.
@aramase may I know if there is any progress regarding this? Thanks🙇
==================
Found some updates in the meeting notes. If there is an ETA for the feature, that would be great.