cluster-api-provider-aws icon indicating copy to clipboard operation
cluster-api-provider-aws copied to clipboard

AWSMachineTemplate mutable fields have no path to being reconciled when modified

Open rudoi opened this issue 5 years ago • 17 comments

/kind bug

The AWSMachine object has a handful of mutable fields: additionalTags, additionalSecurityGroups.

The general guidance for deploying CAPI infrastructure is to use MachineDeployments for worker nodes. This makes sense, but in doing so access to these mutable fields is lost. This is for a few reasons:

  1. AWSMachineTemplate objects are fully immutable
  2. The MachineDeployment controller only speaks "replacement"
  3. The MachineSet controller has no logic for processing changes to the template

@sethp-nr and I are investigating a possible flow where the AWSMachine controller will manage this kind of change, we'll open a PR for discussion.

It's worth noting that, while we're going to try and keep this fix in CAPA land for now, there is a degree of access that only CAPI has - the MachineSet controller, for each MachineSet, knows what its InfrastructureTemplate looks like and the list of Machines it already owns.

Also, KubeadmControlPlane has a different field name for its infrastructure template reference. Note that this field is mutable, whereas the infra template reference on a MachineSet is not mutable, so we elect to ignore KCP for now.

/assign @rudoi @sethp-nr /lifecycle active

rudoi avatar May 04 '20 18:05 rudoi

I'm wondering if we can simplify the implementation required for this a bit (and gain KCP support) by adding a field or annotation to the resources created from the template that is set during cloning to indicate what template it was cloned from. That way mapping a template to an instantiated resource can be done directly instead of trying to introspect through types.

If this field/annotation is made optional it's something that we could add into v1alpha3 as a non-breaking change.

detiber avatar May 04 '20 19:05 detiber

I'm torn because with annotations you don't necessarily have to worry about the unstructured external reconciliation, but it may be hard to pack that metadata with like name, namespace, kind, etc.

With a field, even if optional it implies some kind of contract where minimally each provider would need to update their type to have a predictable field name somewhere in the Spec.

On the other hand, specific to CAPA (meaning no impact on other providers), we are definitely aware of:

  1. what fields are mutable
  2. what kinds of objects consume AWSMachineTemplates

So, it may feel clunky to look up MachineSets in the AWSMachine controller, for example, but at least the impact there has no bearing on other providers nor does it introduce any complexity into the upstream controllers' reconciliation loops.

rudoi avatar May 04 '20 20:05 rudoi

I think we could simplify a lot if we could get away with adding a field to Machine. Something like "templatedFrom," and then tweak the MachineSet & KCP controllers to reconcile that field (though we'd have to either make an assumption in the KCP, or wait to do it on the next create). Either way, it's a more "invasive" change from CAPI's perspective.

The AWSMachine controller would still have to grow it dependency on the Machine contract, but there's some precedent for that in the Kubernetes version.

sethp-nr avatar May 05 '20 16:05 sethp-nr

For my clarification, are you proposing that CAPA would see that an AWSMachine came from an AWSMachineTemplate, and then it would keep certain things from the template reconciled for each ec2 instance (tags, security groups)?

ncdc avatar May 13 '20 17:05 ncdc

Yeah, it already knows how to reconcile tags + security groups from the AWSMachine to the ec2 instance. What we're imagining doing is giving a path to reconcile from the template down to the AWSMachine so that those fields would be managed at the template level instead of on a machine-by-machine basis.

sethp-nr avatar May 15 '20 21:05 sethp-nr

/priority important-soon

detiber avatar May 18 '20 17:05 detiber

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Nov 12 '20 13:11 fejta-bot

I believe this issue is still looking for a fix. Right @rudoi ?

detro avatar Nov 14 '20 17:11 detro

/remove-lifecycle stale

detro avatar Nov 14 '20 17:11 detro

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar Feb 12 '21 17:02 fejta-bot

/lifecycle frozen

randomvariable avatar Mar 11 '21 18:03 randomvariable

needs upstream issue in CAPI.

randomvariable avatar Mar 11 '21 18:03 randomvariable

/assign @randomvariable

sedefsavas avatar Nov 01 '21 17:11 sedefsavas

/triage accepted

randomvariable avatar Nov 01 '21 17:11 randomvariable

/remove-lifecycle frozen

richardcase avatar Jul 08 '22 22:07 richardcase

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 06 '22 23:10 k8s-triage-robot

/remove-lifecycle stale

richardcase avatar Oct 10 '22 10:10 richardcase

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged. Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Deprioritize it with /priority important-longterm or /priority backlog
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Feb 08 '23 17:02 k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 08 '23 17:02 k8s-triage-robot

This issue is currently awaiting triage.

If CAPA/CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 08 '23 17:02 k8s-ci-robot

Hi all, it looks like there is CAPI limitation for AWSMachineTemplate to have mutable fields, right? Is it doable to workaround from CAPA or needs issue to CAPI? Could someone have some guidance? Thanks!

wyike avatar Feb 13 '23 03:02 wyike

From office hours 3/6/23: This seems like a feature request, not a bug.

/kind feature

dlipovetsky avatar Mar 06 '23 17:03 dlipovetsky

Not feasible given CAPI assumptions around machine templates.

/close

dlipovetsky avatar Mar 06 '23 17:03 dlipovetsky

@dlipovetsky: Closing this issue.

In response to this:

Not feasible given CAPI assumptions around machine templates.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 06 '23 17:03 k8s-ci-robot