gardener-extension-provider-aws

Change the default capacity of volumes for `Etcd` from `80Gi` to `25Gi`.

Open • renormalize opened this pull request 10 months ago • 7 comments

How to categorize this PR?

/area cost
/area storage
/kind technical-debt
/platform aws

What this PR does / why we need it:

This PR changes the default capacity of `Etcd` volumes from `80Gi` to `25Gi`.

  • Change the default capacity from `80Gi` to `25Gi` in `charts/gardener-extension-provider-aws/values.yaml`.

  • Change the default capacity from `80Gi` to `25Gi` in `example/00-componentconfig.yaml`.

  • Change tests to check that `*etcd.Spec.StorageCapacity` equals `25Gi` instead of `80Gi`.

  • Run `make generate` to generate the new chart in `example/controller-registration.yaml`.

Volumes of size 80Gi are no longer necessary after the move from gp2 to gp3. See #932 for details.

Which issue(s) this PR fixes: Fixes #932

Special notes for your reviewer:

/invite @shreyas-s-rao

Release note:

New `Etcd` `gp3` volumes are now created with `25Gi` capacity instead of `80Gi` to save on storage costs.

renormalize • Apr 24 '24 11:04

/hold We will need to test how druid would react to a change in Etcd Spec.StorageCapacity, since currently it does not automatically change the PVC template for the etcd statefulset (forbidden change).

shreyas-s-rao • Apr 25 '24 06:04

@shreyas-s-rao As discussed, I've added a check for whether the older Etcd has an 80Gi volume. If it does, the resize to the new default of 25Gi is not performed. For any other value, regular reconciliation occurs. Please let me know if you have any suggestions.

renormalize • Apr 28 '24 16:04

@renormalize thanks for making the change. I would suggest that you simply check whether `oldEtcd` is nil (i.e., it's a new etcd) or `oldEtcd.Spec.StorageCapacity` is not set; in either case, set the capacity to 25Gi. Otherwise, don't mutate it. In other words, we want to set 25Gi only for new etcds, since druid does not yet allow changing `storageCapacity` for existing etcds.
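The rule suggested above could be sketched roughly as follows. This is only an illustration, not the code from this PR: `Etcd`, `EtcdSpec`, and `ensureStorageCapacity` are simplified, hypothetical stand-ins (the real druid type stores the capacity as a quantity rather than a string), and "don't mutate" is read here as leaving the new object untouched.

```go
package main

import "fmt"

// EtcdSpec and Etcd are simplified, hypothetical stand-ins for the druid
// Etcd resource; only the field relevant to this discussion is modelled.
type EtcdSpec struct {
	StorageCapacity *string
}

type Etcd struct {
	Spec EtcdSpec
}

// defaultEtcdCapacity is the new default proposed in this PR.
const defaultEtcdCapacity = "25Gi"

// ensureStorageCapacity (hypothetical helper) applies the default only when
// no capacity was recorded before: either the Etcd is new (oldEtcd == nil)
// or the old object has no StorageCapacity set. Otherwise newEtcd is left
// unmutated.
func ensureStorageCapacity(newEtcd, oldEtcd *Etcd) {
	if oldEtcd == nil || oldEtcd.Spec.StorageCapacity == nil {
		capacity := defaultEtcdCapacity
		newEtcd.Spec.StorageCapacity = &capacity
	}
}

func main() {
	// New Etcd: there is no old object, so the default is applied.
	fresh := &Etcd{}
	ensureStorageCapacity(fresh, nil)
	fmt.Println(*fresh.Spec.StorageCapacity) // prints "25Gi"

	// Existing Etcd created with 80Gi: the new object is not mutated.
	eighty := "80Gi"
	old := &Etcd{Spec: EtcdSpec{StorageCapacity: &eighty}}
	existing := &Etcd{Spec: EtcdSpec{StorageCapacity: &eighty}}
	ensureStorageCapacity(existing, old)
	fmt.Println(*existing.Spec.StorageCapacity) // prints "80Gi"
}
```

With this shape, the 80Gi special case is not needed: any existing Etcd that already has a capacity keeps it, whatever the value.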

shreyas-s-rao • Apr 28 '24 16:04

I've tested the changes on an extension-provider setup with @shreyas-s-rao's guidance.

I've done the following to test the changes:

  • Spawn the setup with gardener/gardener on master, with the latest release of gardener-extension-provider-aws.
  • Create 2 shoots, and hibernate one of them. Etcd volumes are created with size 80Gi.
  • Change the chart and image of the controllerdeployment of provider-aws to the chart generated in this PR and the image built from this PR.
  • Check the volume sizes of the Etcds for the two spawned shoots. They did not change.
  • Create a new shoot. Etcd volume is created with size 25Gi.
  • Wake the hibernated shoot. Volume size did not change.
  • Reconcile all three shoots. Volume size did not change.
  • Change the capacity in the Etcd CR to some value other than the defined value.
    • Old Etcds remain at 80Gi when changed to some other value.
    • New Etcds remain at 25Gi when changed to some other value.

Thus:

  • [x] We've validated that older Etcds of running shoots do not get resized by the changes; their size remains at 80Gi.
  • [x] We've validated that hibernated shoots which are woken up after the changes from this PR are deployed behave as expected, just like shoots which are already running with 80Gi volumes.
  • [x] New shoots that are created get deployed with volumes of size 25Gi.

/ping @gardener/gardener-extension-provider-aws-maintainers

renormalize • May 17 '24 12:05

@gardener/gardener-extension-provider-aws-maintainers ℹ️ please take some time to help renormalize or redirect to someone else if you can't.

gardener-robot • May 17 '24 12:05

/ping @gardener/gardener-extension-provider-aws-maintainers Could you take a look, since this PR directly affects storage costs?

renormalize • May 23 '24 08:05

@gardener/gardener-extension-provider-aws-maintainers ℹ️ please take some time to help renormalize or redirect to someone else if you can't.

gardener-robot • May 23 '24 08:05