cluster-api-provider-aws
Consider moving etcd to its own volume
/kind feature
Describe the solution you'd like
Currently we only provision a single storage volume for instances by default. We should investigate creating a dedicated volume for etcd storage and verify that the default configuration ensures both adequate and consistent performance characteristics.
Interestingly, that would also make it possible to use EBS volume encryption at rest without the rigmarole of rolling a custom AMI.
Because this only affects control-plane nodes, it should also be able to handle any image provided by image-builder, including images that have been STIG partitioned or that store emptyDir and containerd layers on a separate volume.
/assign
I believe this could be done by adding an additional field to infrav1.Instance for etcd volumes. It may be handy to allow further customization in the future, such as encryption. I'm running local tests with this configuration.
It might make sense for this to be part of CAPI rather than provider-specific: an optional etcd volume would benefit every cloud provider and requires hooks in the build process to set up the mounts/fstab entries automatically. The implementation of the volumes can remain platform-specific, since the controller will provision the EBS volume based on the AWS-specific parameters in the createInstance() function.
I am testing the possibility of intercepting the userData creation by the AWSMachine controller. Because the default userData is stored in the KubeadmConfig secret and is fetched via machine.spec.bootstrap.dataSecretName, I think it would be possible to append commands for the case of a customized etcd volume, but they would need to be prepended so they run before kubeadm is called.
Linked: https://github.com/kubernetes/kubeadm/issues/2127
A workaround is to manually remove the lost+found directory inside preKubeadmCommands.
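For reference, a minimal sketch of that workaround, assuming the etcd volume is formatted as ext4 (which is what creates lost+found) and mounted at /var/lib/etcd; the surrounding KubeadmConfig fields are omitted:

```yaml
# Fragment of a KubeadmConfig / kubeadmConfigSpec; illustrative only.
preKubeadmCommands:
  # mkfs.ext4 creates lost+found, which makes kubeadm treat the etcd data
  # directory as non-empty (kubernetes/kubeadm#2127), so remove it before
  # kubeadm init/join runs.
  - rm -rf /var/lib/etcd/lost+found
```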
/priority important-longterm
/milestone v0.5.x
https://github.com/kubernetes-sigs/cluster-api/issues/2994
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/lifecycle frozen
/triage accepted
/unassign @bagnaram
If you are working on this, feel free to assign it back to yourself.
With https://github.com/kubernetes-sigs/cluster-api/pull/3066/ merged in cluster-api, this use case is now possible, although it is not a CAPA default. To use a separate volume for etcd:
- Configure the additional volume in the AWSMachine (or AWSMachineTemplate used to create it) resource.
- Configure a mount of the volume in the KubeadmConfig resource (see the sketch below).
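A hedged sketch of what those two steps might look like, assuming a CAPA version that exposes nonRootVolumes on AWSMachine/AWSMachineTemplate and a CAPI version with the diskSetup/mounts fields from PR #3066; the device names, size, label, resource name, and API version are illustrative:

```yaml
# Extra EBS volume on the control-plane machines (AWSMachineTemplate fragment).
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3   # adjust to your CAPA version
kind: AWSMachineTemplate
metadata:
  name: control-plane                # illustrative name
spec:
  template:
    spec:
      nonRootVolumes:
        - deviceName: /dev/sdb       # may surface in the guest as /dev/xvdb or an NVMe device
          size: 50                   # GiB, illustrative
          type: gp2
          encrypted: true
---
# Matching kubeadmConfigSpec fragment (KubeadmControlPlane or KubeadmConfig):
# format the volume, mount it at /var/lib/etcd, and clean up lost+found.
diskSetup:
  filesystems:
    - device: /dev/xvdb              # adjust to how the device appears in the guest
      filesystem: ext4
      label: etcd_disk
mounts:
  - - LABEL=etcd_disk
    - /var/lib/etcd
preKubeadmCommands:
  - rm -rf /var/lib/etcd/lost+found  # see kubernetes/kubeadm#2127
```

The mount needs to be in place before kubeadm runs; cloud-init's disk_setup and mounts modules run ahead of the final runcmd stage where kubeadm is invoked, so the ordering should work out with this configuration.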
/remove-lifecycle frozen
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.