eksctl apply: cluster reconciliation

Open michaelbeaumont opened this issue 4 years ago • 26 comments

Why do you want this feature?

This is the umbrella issue for eksctl apply, which would support eksctl apply -f config.yaml to reconcile the current cluster with the given config (initially only partially). Also sometimes known as eksctl update.

Discussion

Give us your ideas and use cases in the github discussion!

Related issues:

  • #1497
  • #462 (previous umbrella issue, closed because the history is confusing)
  • #20
  • https://github.com/weaveworks/eksctl/issues/583 (discussion around storing metadata)

michaelbeaumont avatar Oct 27 '20 15:10 michaelbeaumont

🎉

This issue is going to track the delivery of a proposal for how we will implement gitops-style reconciliation with eksctl.

Deliverable is a proposal in docs/ which details:

  1. The end state of eksctl apply (i.e. ignoring everything else which currently exists, what would it look like if we could apply today?)
  2. What the rest of the eksctl UX looks like in a world with apply (i.e. which flags, subcommands, etc. would survive?)
  3. An implementation strategy: how we adapt the code to get from now to apply (design diagrams may be handy)
  4. A consideration of the risks and anything which may stop us from truly representing a cluster in config

Doing gitops reconciliation has been the goal of this project for a very long time, but the rollout was so gradual that the Plan was eventually forgotten as people moved off the project. We need to move incrementally, but let's aim not to be so incremental that we forget the big picture.

Please add comments on what you would like to see in a proposal doc for this work, which questions you would like answered, etc.

Note: no need to start discussing ideas right here right now, save it for the proposal. This is just to sign off on what we want covered in a proposal doc.

Callisto13 avatar Dec 02 '20 15:12 Callisto13

Looks good to me. I think something that might be worth adding to the proposal is how to handle existing clusters that have been created from a config file, but then updated imperatively through commands like eksctl create nodegroup and eksctl upgrade cluster etc.

aclevername avatar Dec 02 '20 16:12 aclevername

excellent point!

Callisto13 avatar Dec 02 '20 16:12 Callisto13

Absolutely would love to see this. I would argue that this is the most important feature moving forward, because the main point of eksctl is that it simplifies the management of EKS clusters. However, we have already had to implement multiple workarounds and hacks just because the eksctl upgrade cluster -f config.yaml and eksctl upgrade nodegroup -f config.yaml commands don't honor changes made to the config manifest. That kinda undermines the whole simplification part, as those hacks would not be necessary if something like terraform was used instead.

artem-nefedov avatar Dec 03 '20 19:12 artem-nefedov

To clarify for 2.), I don't see the alteration of any existing commands as in scope for this issue. The goal would be to make them unnecessary when using apply of course.

michaelbeaumont avatar Jan 04 '21 20:01 michaelbeaumont

👍 totally. I didn't mean "alteration", more "termination" 😈

Callisto13 avatar Jan 04 '21 21:01 Callisto13

I think 3) is sort of dependent on 1) being "finalized", right? That is, I don't think a proposal for the behavior of apply should need to answer implementation questions, although we should consider 4).

michaelbeaumont avatar Jan 06 '21 09:01 michaelbeaumont

🤔 Isn't explaining how we will do the thing one of the core parts of a design proposal? It's not like we can start work without it/a bunch of tickets

I mean, yes one does depend on the other, but they can be presented and discussed at the same time.

Callisto13 avatar Jan 06 '21 09:01 Callisto13

I would have thought the first step/proposal is something users might engage with. It would answer the question "how does eksctl apply behave with this config?".

EDIT: it would also answer questions around incrementally introducing support/changing configs, etc., all of the user visible changes.

michaelbeaumont avatar Jan 06 '21 09:01 michaelbeaumont

A complete proposal should cover all stages, all questions, and all risks, but we could get away with not providing them all at once? The danger (not really danger, inconvenience) is that sometimes risks and questions discovered later can end up influencing the behaviour, so we would have to circle back

Either way I am not bothered if we take each step separately so long as all information is covered before we start work.

Callisto13 avatar Jan 06 '21 09:01 Callisto13

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Feb 06 '21 01:02 github-actions[bot]

Just want to say that if this feature means I can stop gitops'ing eksctl with bash scripts, this would be such a win.

/going back into my hole now

bitva77 avatar May 22 '21 03:05 bitva77

I'd really like to see this too.

Use case: I wanted to attach a new iam addon policy to an existing cluster. I ended up doing:

  1. Update my cluster config file with the new addon
  2. Create a new node group with a new name (arbitrarily renamed it by adding -abc to the end of the name)
  3. Delete the old node group

This worked but it was a pretty big ordeal that took over 8 minutes while waiting for AWS and also performing destructive actions. This all boils down to adding a policy to a group of nodes right? Would be great if you could run 1 command that applies the new policies to your existing nodes.
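A minimal sketch of that create-then-delete workaround, assuming the config file already lists the renamed node group (with the new policy) and no longer lists the old one. The cluster, file, and node group names are placeholders, the helper `replacement_commands` is hypothetical, and nothing is executed: the script only prints the two eksctl invocations, whose flags are worth checking against `eksctl create nodegroup --help` and `eksctl delete nodegroup --help` for your version.

```python
import shlex


def replacement_commands(cluster, config_file, old_ng, new_ng):
    """Build the two eksctl invocations for the create-then-delete workaround.

    Assumes config_file already defines new_ng and no longer defines old_ng.
    The commands are only constructed here, never run.
    """
    create = [
        "eksctl", "create", "nodegroup",
        f"--config-file={config_file}",
        f"--include={new_ng}",  # only create the renamed group
    ]
    delete = [
        "eksctl", "delete", "nodegroup",
        f"--cluster={cluster}",
        f"--name={old_ng}",  # remove the group that lacks the new policy
    ]
    # Order matters: the new group should exist before the old one is deleted.
    return [create, delete]


if __name__ == "__main__":
    # Placeholder names, chosen only for illustration.
    for cmd in replacement_commands("demo", "cluster.yaml", "workers", "workers-abc"):
        print(shlex.join(cmd))
```

Creating first and deleting second keeps the old nodes serving workloads until the replacement group is up, which is the same reason this takes several minutes in practice.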

nickjj avatar Jul 04 '21 17:07 nickjj

Is there any progress?

kahirokunn avatar Oct 12 '21 09:10 kahirokunn

Hi @kahirokunn, not currently. We are still considering what the next big steps should be for eksctl.

aclevername avatar Oct 14 '21 08:10 aclevername

This would really be the killer feature to be able to use eksctl e.g. in pipelines to roll-out clusters & cluster updates.

steffakasid avatar Jan 25 '22 08:01 steffakasid

Personal opinion, without this feature I'm not comfortable using eksctl in production. Very cool project though, I hope it keeps going.

EliMor avatar Apr 11 '22 22:04 EliMor

Things have improved in this area. We use eksctl in Production just fine. I mean, how often are you changing EKS control plane configuration? Like never. Node groups are the only thing we change and I'm not sure I want node groups to be deleted automatically yet...especially in Production.

bitva77 avatar Apr 11 '22 23:04 bitva77

Very cool project though, I hope it keeps going.

That's a bit passive-aggressive. :) It was created in 2018 and has been going strong for a while now, without any indication of stopping. :) And it just gets better and better.

While apply IS a killer feature, you can combine eksctl with flux easily to support this workflow by eksctl creating the necessary files and flux managing them. eksctl even sports a flux integration itself. Or you can use eksctl together with terraform to achieve a declarative description of your infrastructure.

We might eventually support apply, but there are many features that take precedence. For example, supporting Karpenter out of the box, allowing users to explore and use Karpenter in a friendly and easy way. :)

Skarlso avatar Apr 12 '22 05:04 Skarlso

There are a lot of things that can be safely changed in-place, but you have to perform a separate command for each of them, which is extremely annoying and un-declarative.

These include (but are not limited to):

  • AWS resource tags
  • IAM serviceaccounts
  • public cluster API endpoint CIDR ranges

If eksctl apply could handle just those "safe" cases, that would already be wonderful.
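A rough sketch of what handling only those "safe" cases might look like: diff the current and desired config, then split the changed fields by an allowlist. The flat dict layout and the key names in `SAFE_KEYS` are made up for illustration and do not mirror eksctl's real config schema or code.

```python
# Hypothetical allowlist of fields treated as safe to change in place.
# The key names are illustrative, not eksctl's actual schema.
SAFE_KEYS = {"tags", "iam_service_accounts", "public_access_cidrs"}


def split_changes(current, desired):
    """Compare two flat config dicts and split changed keys into (safe, unsafe)."""
    changed = {
        key
        for key in current.keys() | desired.keys()
        if current.get(key) != desired.get(key)
    }
    safe = sorted(changed & SAFE_KEYS)
    unsafe = sorted(changed - SAFE_KEYS)
    return safe, unsafe


if __name__ == "__main__":
    current = {
        "tags": {"env": "dev"},
        "public_access_cidrs": ["0.0.0.0/0"],
        "instance_type": "m5.large",
    }
    desired = {
        "tags": {"env": "prod"},
        "public_access_cidrs": ["10.0.0.0/8"],
        "instance_type": "m5.xlarge",
    }
    safe, unsafe = split_changes(current, desired)
    print("apply in place:", safe)
    print("left for a manual command:", unsafe)
```

An apply limited this way would act on the first list and only report the second, so nothing destructive happens without an explicit command.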

artem-nefedov avatar Apr 12 '22 07:04 artem-nefedov

And we graciously accept community contributions for these features and will happily help get the PR ready and merged! :)

Skarlso avatar Apr 12 '22 07:04 Skarlso

One thing I want to suggest is that eksctl apply doesn't need to be functionality within eksctl itself. Even a GitHub Action that simulates reconciliation would be good enough.

All we need to do is put the right group of commands and hooks in place. For example, even without in-place node reconciliation, this can be done by:

  1. Mandating a unique node group name (which may include a version or date)
  2. Creating the new node groups and removing the orphaned ones

However, the most challenging part of the scripts would be recovering from errors and achieving idempotency. If someone already has these in bits and pieces, as a community we can convert them into a GitHub Action. Those who are new, like me, are really just looking for ideas on how to automate this. I liked @Skarlso's comment but have no idea how to implement something like that.
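The two steps above can be sketched as a set difference between the node group names in the config and the names that currently exist. This is only an illustration under stated assumptions: the function names are hypothetical, the existing names would have to come from something like the output of eksctl get nodegroup, and the "apply" step is simulated in memory rather than calling eksctl.

```python
def plan_node_groups(desired, existing):
    """Return (to_create, to_delete) so that existing converges on desired.

    Names only in the config are created; names only in the cluster are
    orphans and get deleted. Names present in both are left untouched.
    """
    desired, existing = set(desired), set(existing)
    return sorted(desired - existing), sorted(existing - desired)


def simulate_apply(existing, to_create, to_delete):
    """Pretend to run the plan and return the resulting set of node groups."""
    return (set(existing) | set(to_create)) - set(to_delete)


if __name__ == "__main__":
    # Dated names stand in for the "unique node group name" rule.
    desired = ["workers-2022-04-30"]
    existing = ["workers-2022-03-01"]

    to_create, to_delete = plan_node_groups(desired, existing)
    print("create:", to_create)
    print("delete:", to_delete)

    # Idempotency check: planning again after a successful run finds no work.
    after = simulate_apply(existing, to_create, to_delete)
    print("second run:", plan_node_groups(desired, after))
```

Because the plan is recomputed from the observed state each time, a run that fails halfway can simply be retried: whatever was already created drops out of the next plan.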

rverma-dev avatar Apr 30 '22 01:04 rverma-dev

Hold on... there's no such thing as apply? Am I having a case of the Mandela Effect? I swear I've used eksctl apply before.

How was this not a thing from the beginning?

mhemken-vts avatar Jan 10 '23 23:01 mhemken-vts

We might eventually support apply, but there are many features that take precedence. For example, supporting Karpenter out of the box, allowing users to explore and use Karpenter in a friendly and easy way. :)

How does Karpenter have precedence over apply? Isn't the whole point of declarative infra that you can do idempotent applies repeatedly? I thought that was the whole point of this tool.

For those using this tool in production, what is the normal development cycle? Do you just create a new cluster when the old one's configuration inevitably becomes outdated?

mhemken-vts avatar Jan 10 '23 23:01 mhemken-vts

I think you're confusing eksctl with terraform

bitva77 avatar Jan 10 '23 23:01 bitva77

@mhemken-vts I'm creating a new cluster and new nodegroups when I need to change something. It's horrible, I know.

However, apply would be great only for the cluster itself.

I honestly think it's better to create new node groups and then delete the old ones: if your change is problematic, your old node groups are still working.

matti avatar Jan 11 '23 05:01 matti