cluster-api-provider-aws
cluster-api-provider-aws copied to clipboard
AWSMachineTemplate mutable fields have no path to being reconciled when modified
/kind bug
The AWSMachine object has a handful of mutable fields: additionalTags, additionalSecurityGroups.
The general guidance for deploying CAPI infrastructure is to use MachineDeployments for worker nodes. This makes sense, but in doing so access to these mutable fields is lost. This is for a few reasons:
- AWSMachineTemplate objects are fully immutable
- The MachineDeployment controller only speaks "replacement"
- The MachineSet controller has no logic for processing changes to the template
@sethp-nr and I are investigating a possible flow where the AWSMachine controller will manage this kind of change, we'll open a PR for discussion.
It's worth noting that, while we're going to try and keep this fix in CAPA land for now, there is a degree of access that only CAPI has - the MachineSet controller, for each MachineSet, knows what its InfrastructureTemplate looks like and the list of Machines it already owns.
Also, KubeadmControlPlane has a different field name for its infrastructure template reference. Note that this field is mutable, whereas the infra template reference on a MachineSet is not mutable, so we elect to ignore KCP for now.
/assign @rudoi @sethp-nr /lifecycle active
I'm wondering if we can simplify the implementation required for this a bit (and gain KCP support) by adding a field or annotation to the resources created from the template that is set during cloning to indicate what template it was cloned from. That way mapping a template to an instantiated resource can be done directly instead of trying to introspect through types.
If this field/annotation is made optional it's something that we could add into v1alpha3 as a non-breaking change.
I'm torn because with annotations you don't necessarily have to worry about the unstructured external reconciliation, but it may be hard to pack that metadata with like name, namespace, kind, etc.
With a field, even if optional it implies some kind of contract where minimally each provider would need to update their type to have a predictable field name somewhere in the Spec.
On the other hand, specific to CAPA (meaning no impact on other providers), we are definitely aware of:
- what fields are mutable
- what kinds of objects consume AWSMachineTemplates
So, it may feel clunky to look up MachineSets in the AWSMachine controller, for example, but at least the impact there has no bearing on other providers nor does it introduce any complexity into the upstream controllers' reconciliation loops.
I think we could simplify a lot if we could get away with adding a field to Machine. Something like "templatedFrom," and then tweak the MachineSet & KCP controllers to reconcile that field (though we'd have to either make an assumption in the KCP, or wait to do it on the next create). Either way, it's a more "invasive" change from CAPI's perspective.
The AWSMachine controller would still have to grow it dependency on the Machine contract, but there's some precedent for that in the Kubernetes version.
For my clarification, are you proposing that CAPA would see that an AWSMachine came from an AWSMachineTemplate, and then it would keep certain things from the template reconciled for each ec2 instance (tags, security groups)?
Yeah, it already knows how to reconcile tags + security groups from the AWSMachine to the ec2 instance. What we're imagining doing is giving a path to reconcile from the template down to the AWSMachine so that those fields would be managed at the template level instead of on a machine-by-machine basis.
/priority important-soon
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
I believe this issue is still looking for a fix. Right @rudoi ?
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/lifecycle frozen
needs upstream issue in CAPI.
/assign @randomvariable
/triage accepted
/remove-lifecycle frozen
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the issue is closed
You can:
- Mark this issue or PR as fresh with
/remove-lifecycle stale - Mark this issue or PR as rotten with
/lifecycle rotten - Close this issue or PR with
/close - Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.
You can:
- Confirm that this issue is still relevant with
/triage accepted(org members only) - Deprioritize it with
/priority important-longtermor/priority backlog - Close this issue with
/close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the PR is closed
You can:
- Mark this PR as fresh with
/remove-lifecycle stale - Close this PR with
/close - Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
This issue is currently awaiting triage.
If CAPA/CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi all, it looks like there is CAPI limitation for AWSMachineTemplate to have mutable fields, right? Is it doable to workaround from CAPA or needs issue to CAPI? Could someone have some guidance? Thanks!
From office hours 3/6/23: This seems like a feature request, not a bug.
/kind feature
Not feasible given CAPI assumptions around machine templates.
/close
@dlipovetsky: Closing this issue.
In response to this:
Not feasible given CAPI assumptions around machine templates.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.