secrets-provider-for-k8s
Secrets rotation improvements
Hello,
In our enterprise, we faced an urgent need for password rotation. One of the main requirements was the ability to rotate credentials without service restarts (containers must read and apply changed credentials on the fly). The push-to-file solution also did not fit, as it requires a sidecar inside each Kubernetes pod. Since we run hundreds of Kubernetes clusters, each with dozens of services, this creates significant resource overhead (on top of the Istio service-mesh sidecar), and the deployment configuration for every pod would be redundant.
We decided to modify the secrets-provider implementation to fulfill these requirements and enable smoother password rotation. In the following sections, I describe how the whole solution works and which codebase modifications were necessary. We have also prepared some draft pull requests; if there is interest in any of them, we can proceed.
The Workflow
Note: for password rotation, we encourage using dual accounts.
1. The Conjur provider is deployed as a Deployment in its own namespace. A Kubernetes webhook is registered to react to every new or modified labeled Secret in the cluster.
2. A new labeled Secret is deployed; the mutating webhook is called.
3. The Conjur provider's webhook server processes the event and triggers secret provisioning for the given Secret object.
4. The retrieved Conjur secret values (and group templates) are injected into the Kubernetes Secret.
5. The Kubernetes Secret's key is mounted (projected) as a file into the container filesystem.
6. The mounted file is read during application service bootstrap.
7. The Kubernetes Secret is modified/redeployed.
8. (Same as step 2.)
9. (Same as step 3.)
10. New Secret key values are automatically projected into the container filesystem.
11. The application service reloads the modified file on the fly at runtime.
12. The Conjur provider's timer fires: all labeled K8s Secrets are retrieved, and all Conjur secrets are retrieved.
13. New values and template groups are injected into the appropriate Kubernetes Secrets.
14. New Secret mounts are projected into the container filesystem.
15. Every application service reloads its modified file.
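The provisioning step (retrieve from Conjur, inject into the K8s Secret) can be sketched as follows. This is an illustration only: `provisionSecret`, `conjurMap`, and the `fetch` callback are hypothetical names standing in for the provider's real types and its Conjur retrieval call.

```go
package main

import "fmt"

// provisionSecret resolves a conjur-map (K8s Secret key -> Conjur variable ID)
// into the data that would be injected back into the Kubernetes Secret.
// fetch stands in for the real Conjur retrieval call.
func provisionSecret(conjurMap map[string]string, fetch func(varID string) (string, error)) (map[string][]byte, error) {
	data := make(map[string][]byte, len(conjurMap))
	for key, varID := range conjurMap {
		val, err := fetch(varID)
		if err != nil {
			return nil, fmt.Errorf("retrieving %q: %w", varID, err)
		}
		data[key] = []byte(val)
	}
	return data, nil
}

func main() {
	// Illustrative conjur-map, as it might appear in a labeled Secret.
	conjurMap := map[string]string{"password": "prod/db/password"}
	fake := func(id string) (string, error) { return "s3cr3t", nil }
	data, _ := provisionSecret(conjurMap, fake)
	fmt.Println(string(data["password"])) // s3cr3t
}
```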
Advantages:
- Only one instance of the secrets-provider needs to be deployed in the cluster.
- Targeted Kubernetes Secrets are easily added or removed via labeling.
- Since the provider periodically checks and retrieves Conjur secrets, it is well suited for password rotation scenarios.
- Thanks to the Kubernetes webhook, every new or modified Kubernetes Secret object is provisioned immediately.
- When an application pod runs multiple instances that consume data from a single Kubernetes Secret, there is no need to run the Conjur provider separately for each instance.
- When secret data items are mounted into the container's filesystem as files, any change in the Kubernetes Secret is automatically reflected in the mounted files (this depends on the application framework's ability to reload changed configuration files on the fly at runtime, which eliminates the need to restart the application container).
- Both conjur-map and templates are allowed for a K8s Secret.
Implementation Changes
To run the full flow described above, all of the changes should be implemented; however, some features might also be useful standalone. Draft pull requests have been prepared for each feature area.
- The provider can retrieve K8s Secrets based on their labels; there is no need to hardcode K8s Secret names in the provider configuration. Only Secrets with the appropriate labels are handled. PR: https://github.com/cyberark/secrets-provider-for-k8s/pull/550
- The provider can be deployed as a standalone Kubernetes pod and run the provisioning process periodically. PR: https://github.com/cyberark/secrets-provider-for-k8s/pull/553
- The provider configures and implements a K8s mutating webhook. When a new K8s Secret is deployed or an existing one is modified, the webhook triggers the Conjur provider to retrieve the Conjur secrets and immediately mutate the K8s Secret. PR: https://github.com/cyberark/secrets-provider-for-k8s/pull/553
- Improved batch Conjur secret retrieval: if one variable in a batch retrieve request fails, it is excluded and the batch request is repeated. PR: https://github.com/cyberark/secrets-provider-for-k8s/pull/551
- File templates can be used with k8s_secrets in the same way as in the push-to-file scenario; the Secret key may then be mounted as a file into the container filesystem. PR: https://github.com/cyberark/secrets-provider-for-k8s/pull/552
- The provider operates at cluster scope: one running instance handles all Secrets in the Kubernetes cluster. For every Secret in every namespace, it uses the correct authenticator, retrieves variables from Conjur, and injects them into the appropriate Kubernetes Secret. This part may seem complex; a draft pull request can be prepared if there is interest.
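The improved batch retrieval logic can be sketched as a retry loop that excludes the failing variable and repeats the request. This is a hedged illustration, not the PR's actual code: `batchRetrieve`, `errFailedVar`, and the `batch` callback are hypothetical stand-ins for the provider's Conjur batch API.

```go
package main

import (
	"errors"
	"fmt"
)

// errFailedVar reports which variable in a batch request failed.
type errFailedVar struct{ id string }

func (e errFailedVar) Error() string { return "failed variable: " + e.id }

// batchRetrieve mimics the improved retriever: when the batch call reports a
// failing variable, that variable is excluded and the request is repeated.
func batchRetrieve(ids []string, batch func([]string) (map[string]string, error)) (map[string]string, []string, error) {
	var excluded []string
	for len(ids) > 0 {
		values, err := batch(ids)
		if err == nil {
			return values, excluded, nil
		}
		var fv errFailedVar
		if !errors.As(err, &fv) {
			return nil, excluded, err // unrecoverable error: give up
		}
		excluded = append(excluded, fv.id)
		remaining := ids[:0:0]
		for _, id := range ids {
			if id != fv.id {
				remaining = append(remaining, id)
			}
		}
		ids = remaining
	}
	return map[string]string{}, excluded, nil
}

func main() {
	// Fake batch endpoint: "db/bad" always fails, everything else resolves.
	fake := func(ids []string) (map[string]string, error) {
		out := map[string]string{}
		for _, id := range ids {
			if id == "db/bad" {
				return nil, errFailedVar{id}
			}
			out[id] = "value-of-" + id
		}
		return out, nil
	}
	values, excluded, _ := batchRetrieve([]string{"db/user", "db/bad", "db/pass"}, fake)
	fmt.Println(len(values), excluded) // 2 [db/bad]
}
```

The benefit of this approach is that one revoked or missing Conjur variable no longer blocks provisioning of every other secret in the same batch.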
Hi @romanfurst,
Thank you so much for this great suite of improvements! We're also thrilled to see users adapting our solutions to fit their unique needs and are even more excited when they release those customizations back to the open source community!
Since this is a fairly large set of changes, it will take us some time before we can give it the level of review needed to determine whether we want to merge it. Rest assured that we will do exactly that; I just can't guarantee when it will happen. I'll keep you posted on our progress once we start on it.
Thank you!
@romanfurst I wanted to update you that we're looking at possibly incorporating some of these changes into our roadmap, though the timeline isn't clear yet.
@szh Thank you for the update. If you need more detailed information, an explanation of any of the attached draft pull requests (or assistance in refining them into a more release-ready state), or any other contribution from me, please let me know.
Tracking internally as CNJR-9100.
Would love to see #552 merged.
@cdenney-silex thanks for the feedback - much appreciated. It's on our roadmap :)
This issue is stale because it has been inactive for 30 days. Please comment to keep it open. Otherwise, it will be automatically closed in 14 days.