gardener icon indicating copy to clipboard operation
gardener copied to clipboard

Make the SecretBinding resource provider specific

Open ialidzhikov opened this issue 4 years ago • 14 comments

How to categorize this issue?

/area scalability /kind enhancement /priority 3

Summary

This proposal aims to make the SecretBinding (and referenced Secret) resource provider specific to allow provider type information of the SecretBinding (or referenced Secret) to be accessed easily by external components.

Motivation

Today, the Shoot resource references a cloud provider Secret using the SecretBinding resource. Although that the SecretBinding resource points to a Secret that usually belongs to a given cloud provider (the Secret data contains required fields as per the documentation of the provider extension), the provider type information is not present (or maintained) on the SecretBinding or on the Secret resource. To determine the provider type of a cloud provider Secret, we have to:

  1. Select all SecretBindings that reference our Secret (there might be also SecretBindings from other Namespaces - the API allows SecretBinding to reference a Secret from other Namespace)
  2. Select all Shoots that are using SecretBindings from step 1)
  3. Map the selected Shoots from step 2) to provider type(s)

As you can see, the whole operation involves several steps. The operation is not a constant (for example data that you can read from the resource) but includes requests/watches and processing afterwards.

Goals

  • Make the SecretBinding resource provider specific (add a new field .provider.type) to allow the provider type information of the SecretBinding to be accessed easily.
    • The provider extension admission components will benefit from this as they will be now able to add a webhook that validates SecretBinding creation - whether the referenced Secret contains the required fields.
    • The dashboard component can benefit from this when showing the Secrets. It currently relies on a cloudprofile.garden.sapcloud.io/name custom annotation and this can be improved - see https://github.com/gardener/dashboard/issues/781.
  • Once the SecretBinding resource has the provider type information, maintain the provider type as a label (for example provider.shoot.gardener.cloud/<type>=true) in the referenced Secret. In this we also make the cloud provider Secret resource to have the provider type information.
    • For example provider extension admission components will benefit from this as they will be now able to select only Secrets matching given provider type. Currently admission components perform the steps from the above section to determine the provider type of the Secret.

Proposal

The main challenge with this proposal is the backwards compatibility. We cannot add a new required field .provider.type to the SecretBinding resource because such new required field would break the API compatibility - existing clients won't be allowed to create/update SecretBindings. Adding the new fields as optional is not enough on its own. We want to make use of the new field and we cannot rely that API users will update their handling somewhen in the future. We want to populate the SecretBinding provider type field based on its current usage. We want also to make this field required in the future.

Here are the proposed steps:

  1. Add new optional provider.type field to the SecretBinding resource.
apiVersion: core.gardener.cloud/v1beta1
kind: SecretBinding
# ...
provider:
  type: <some-provider-name> # {aws,azure,gcp,...}
# ...

Today, the API allows the SecretBinding to reference a Secret that is actually used by multiple provider types. Although that the API allows it, I don't think it is a good practice to reuse single Secret to hold the data for different cloud providers. If we decide to make the field []string (.provider.types) then we will hint the API users that they can use single multi-type Secret (which is something that we don't recommend). To do not break an existing setup with such multi-type Secret, the SecretBinding can actually maintain the provider types in comma-separated form - for example .provider.type="aws,gcp".

  1. Populate the SecretBinding provider type based on its current usage.

We can add a new controller in GCM that for each Shoot will maintain the Shoot provider.type to the SecretBinding (if it is not already present). The controller can be behind a feature gate (for example SecretBindingProviderPopulator) to allow controlled feature enablement for Gardener operators.

  1. Make the SecretBinding provider type required

After we make sure that most of the SecretBindings have already provider.type set, we can forbid:

  • creating new SecretBinding with empty provider and
  • creating a Shoow with SecretBinding that does not match the Shoot provider type (including the case where the SecretBinding provider is empty)

Again, we can control the enablement of this handling with feature gate on Gardener API server side (for example RequireSecretBindingProvider).

Alternatives

Instead of adding new optional field to the existing API and then working on making this new field required, we can create a new API version (core.gardener.cloud/v1beta2) and have the field required there. But introducing a new API version usually takes much more time and effort.


Tasks

  • [x] Introduce a new optional field (provider.type) for the SecretBinding resource and implement the proposal from this issue -> https://github.com/gardener/gardener/pull/5058
  • [x] Adapt the Gardener dashboard to set the new field on SecretBinding creation (and to potentially use it while displaying the infrastructure Secrets by providers) -> https://github.com/gardener/dashboard/issues/1147
  • [x] Adapt other components that create SecretBindings (internal or external ones) to specify the new provider.type field
  • [x] Enable the SecretBinding provider controller by default -> https://github.com/gardener/gardener/pull/5499
  • [x] Enable the SecretBindingProviderValidation feature gate and disable the SecretBinding provider controller -> https://github.com/gardener/gardener/pull/6240
  • [ ] Graduate the SecretBindingProviderValidation feature gate -> https://github.com/gardener/gardener/pull/6429
  • [ ] Clean up the SecretBindingProviderValidation feature gate and the SecretBinding provider controller in GCM

ialidzhikov avatar Oct 21 '21 15:10 ialidzhikov

Generally I'm fine with the proposal. One thought I had was that there were ideas to reuse SecretBinding resources also for other secret references in the Shoot API. For example, the DNS secrets (https://github.com/gardener/gardener/blob/65288a503da25ef2d89a0d019b4f19f78606986f/example/90-shoot.yaml#L259-L265) are directly referenced today. Instead, we we could allow referencing a SecretBinding which would make it possible to reuse the same DNS secret in other projects.

How would this work with your ideas? I'm not even sure if this is something we would like to support since it results in additional work and complexity on our side while it has little benefits. Back then, the main motivation for having SecretBindings for cloud provider secrets was the "shoot quota / trial" feature. Without it, I doubt that we had ever introduced it.

rfranzke avatar Oct 22 '21 06:10 rfranzke

FYI: https://github.com/gardener/gardener/issues/3846 is what I meant above

rfranzke avatar Oct 22 '21 08:10 rfranzke

FYI: #3846 is what I meant above

Thanks for sharing this!

Obviously Shoot provider type (aws, gcp, ...) is different from DNS provider type (aws-route53, google-clouddns, ...). If we allow SecretBindings to be used for DNS, then I think the provider type field still makes sense:

  • to prevent configuring a DNS provider with provider type foo in the Shoot spec that refers to SecretBinding with provider type bar - it will be now possible to add such check
  • to potentially have an admission webhook on SecretBindings with the given DNS provider type that checks for the required fields of the referenced Secret

The only drawback I see is that this approach will always require the DNS SecretBinding to be different from the Shoot SecretBinding - or said with other words it won't be possible to reuse 1 SecretBinding for both. I think today is possible to use single Secret for Shoot (via SecretBinding) and DNS.

On the other side my motivation to make the SecretBinding provider type field a string (and not []string) is to prevent usages of single SecretBinding for multiple cloud provider types - I think this is not something we recommend. I guess it won't be good to mix up Shoot and DNS provider types in the SecretBinding spec.

Bottom line: I guess this proposal can fit with the plans in #3846 if we agree that reuse of 1 SecretBinding for Shoot and DNS is not supported. For example:

apiVersion: core.gardener.cloud/v1beta1
kind: SecretBinding
metadata:
  name: shoot-secret-binding
  namespace: garden-dev
# ...
provider:
  type: aws
apiVersion: core.gardener.cloud/v1beta1
kind: SecretBinding
metadata:
  name: dns-secret-binding
  namespace: garden-dev
# ...
provider:
  type: aws-route53

Actually both SecretBindings can even reference the very same Secret at the end.

Let me also cc @timuthy as author of #3846.

WDYT?

ialidzhikov avatar Oct 25 '21 15:10 ialidzhikov

Sounds good to me, thank you!

rfranzke avatar Oct 26 '21 06:10 rfranzke

Thanks for the detailed proposals @ialidzhikov. I like it!

One question as an additional consideration: If we would go for multiple provider types, i.e. SecretBinding.provider.types[], could we still use a field-selector like provider.types==aws to optimize our watches and caches? Is that supported by the API server (library)?

timebertt avatar Oct 26 '21 06:10 timebertt

Sounds good, thanks @ialidzhikov

Some additional thoughts:

  • provider.type must probably become immutable.

  • Obviously Shoot provider type (aws, gcp, ...) is different from DNS provider type

    For convenience and better user-experience we can think about an internal mapping or raise a requirement for the DNS extensions to also support simple provider names like aws, gcp, etc.

  • We can default the provider type information in the shoot based on the referred SecretBindings via admission.

timuthy avatar Oct 26 '21 07:10 timuthy

If we would go for multiple provider types, i.e. SecretBinding.provider.types[], could we still use a field-selector like provider.types==aws to optimize our watches and caches? Is that supported by the API server (library)?

In the ToSelectableFields func we could join the provider types with a separator (","). https://github.com/gardener/gardener/blob/5095f22a8797327aaf65fc6061c951b14dadf742/pkg/registry/core/shoot/strategy.go#L175-L187

But my assumption is that if a SecretBinding has multiple providers, then you will be able to select the obj via field selector only if you provide all types (but not only a single type). Actually the Pod resource has a field selector for status.podIPs. But to be honest I am not able to make it work. The field selector topic sounds like a yet another reason to go for the single provider approach.


provider.type must probably become immutable.

I can agree with it. But it is not possible to make a newly introduced field (which is optional) required and immutable right away. In the above proposal we could enforce the immutability after step 3.

We can default the provider type information in the shoot based on the referred SecretBindings via admission.

In the above proposal I used a controller in GCM. Do you rather propose that admission plugin will be the more appropriate approach. If yes, do you have in mind pros of the admission plugin approach over the controller one?

ialidzhikov avatar Oct 26 '21 08:10 ialidzhikov

Thanks for checking the details, @ialidzhikov 👍

Actually the Pod resource has a field selector for status.podIPs. But to be honest I am not able to make it work.

From a quick look, it seems like there is only a field path defined for status.podIPs (probably for usage in downward API), but it's not a selectable field: https://github.com/kubernetes/kubernetes/blob/45ce2dfacc87297f69d18245d5255e6e183d63d1/pkg/registry/core/pod/strategy.go#L317-L322 From what I saw in the kube libraries, it doesn't seem to be possible to use a single-valued field-selector for a multi-valued field.

The field selector topic sounds like a yet another reason to go for the single provider approach.

Yes, I agree. If the field selector doesn't work for the providers array, I would definitely vote for having a single provider, to make full use of the benefits of having the type in there.

timebertt avatar Oct 26 '21 09:10 timebertt

But it is not possible to make a newly introduced field (which is optional) required and immutable right away. In the above proposal we could enforce the immutability after step 3.

Hm why not? We could only allow updates if the previous value is empty, or am I missing something?

In the above proposal I used a controller in GCM. Do you rather propose that admission plugin will be the more appropriate approach. If yes, do you have in mind pros of the admission plugin approach over the controller one?

Hm, but this controller is only supposed to maintain the spec.type of the SecretBinding, right? My proposal was related to a shoot creation. Image you created a handful of SecretBindings with correct spec.type and then you create Shoot(s) that refer to those SecretBindings. Then again, the user needs to specify every type information that they already configured in the SecretBindings. So for a better user experience, it would be nice if an admission plugin can fill-out the type information from the referred SecretBindings. Why a admission controller? Because otherwise gardenlet needed to wait until the type information is filled.

timuthy avatar Oct 26 '21 11:10 timuthy

Hm why not? We could only allow updates if the previous value is empty, or am I missing something?

To preserve backwards compatibility for SecretBindings that are used for multiple providers the proposal allows multi-values for the provider.type field - for example provider.type=aws,gcp. There is actually the following case - if you have an AWS Shoot that is using SecretBinding foo, GCM will update foo with provider.type=aws. If after 1 week you create a new Shoot with provider GCP and SecretBinding foo, then GCM will update foo with provider.type=aws,gcp. I still have to clarify when and how the SecretBinding provider populator controller gets disabled. After we make sure that the process of setting the provider type is complete and all in-use SecretBindings have their provider.type set, then we can enforce the immutability check on this field.

Hm, but this controller is only supposed to maintain the spec.type of the SecretBinding, right? My proposal was related to a shoot creation. Image you created a handful of SecretBindings with correct spec.type and then you create Shoot(s) that refer to those SecretBindings. Then again, the user needs to specify every type information that they already configured in the SecretBindings. So for a better user experience, it would be nice if an admission plugin can fill-out the type information from the referred SecretBindings. Why a admission controller? Because otherwise gardenlet needed to wait until the type information is filled.

My first understanding was that you suggest this for the SecretBinding resource. Thanks for clarifying. The idea can make sense but, if you agree, this rather seems to be out of the scope for this proposal. This proposal has the focus on:

  • the API changes for the new field
  • the migration process of "populating" the SecretBinding provider type
  • the process of enforcing validations on top

ialidzhikov avatar Oct 27 '21 10:10 ialidzhikov

Thanks for clarifying the points above @ialidzhikov.

The idea can make sense but, if you agree, this rather seems to be out of the scope for this proposal.

I got your point. OTOH, this is just an issue where we should also be willing to collect non-critical requirements that will otherwise be forgotten.

timuthy avatar Oct 28 '21 07:10 timuthy

The Gardener project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close

/lifecycle stale

gardener-ci-robot avatar Jun 09 '22 15:06 gardener-ci-robot

/remove-lifecycle stale

timebertt avatar Jun 10 '22 05:06 timebertt

/remove-lifecycle frozen

rfranzke avatar Jul 15 '22 08:07 rfranzke

We can close now as all sub-tasks are completed.

/close

ialidzhikov avatar Aug 30 '22 06:08 ialidzhikov

@ialidzhikov: Closing this issue.

In response to this:

We can close now as all sub-tasks are completed.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

gardener-prow[bot] avatar Aug 30 '22 06:08 gardener-prow[bot]