cluster-api-provider-aws

Create an eks-machinepool flavor

Open richardcase opened this issue 4 years ago • 50 comments

/kind feature /area provider/eks /help /good-first-issue

Describe the solution you'd like: We should create a new template for an eks-machinepool flavor, so that users can use clusterctl to create an EKS cluster that uses a machine pool. For example:

clusterctl config cluster my-cluster --kubernetes-version v1.16.8 --flavor eks-machinepool > my-cluster.yaml

This will involve creating the template file, updating the Makefile to copy the template to the out folder, and updating the release docs.

Anything else you would like to add: This is a follow-up from #1863 and is the unmanaged version of #2024.

Environment:

  • Cluster-api-provider-aws version: 0.6.0

richardcase avatar Oct 14 '20 08:10 richardcase

@richardcase: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 14 '20 08:10 k8s-ci-robot

I'll give this a try. /assign

kkeshavamurthy avatar Oct 16 '20 01:10 kkeshavamurthy

/lifecycle active

arghya88 avatar Oct 16 '20 06:10 arghya88

@kkeshavamurthy - reach out here or on slack if you need anything and thanks for working on this.

richardcase avatar Oct 16 '20 07:10 richardcase

@richardcase, thanks, I definitely will. Initially I thought I was supposed to do the flavor for eks-machinepool (managed), and I hadn't noticed that it was already done. I'm not sure what an unmanaged EKS machinepool flavor looks like. I'll look into it a bit more. Any pointers would be very useful.

kkeshavamurthy avatar Oct 16 '20 17:10 kkeshavamurthy

The API definition for AWSMachinePool is here: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/master/exp/api/v1alpha3/awsmachinepool_types.go#L145

I think the flavor should have the minimum set of fields.
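
For illustration, a minimal AWSMachinePool along those lines might look like the sketch below. It assumes the experimental v1alpha3 API from the types file linked above and the field names (minSize, maxSize, awsLaunchTemplate) used by the existing machinepool template. The object name, the sizes, and the ${...} variables are placeholders, not a confirmed part of the final flavor.

```yaml
# Sketch only: a minimal AWSMachinePool, assuming the experimental v1alpha3 API.
# The name, sizes, and ${...} variables are illustrative placeholders.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachinePool
metadata:
  name: "${CLUSTER_NAME}-mp-0"
spec:
  # Lower and upper bounds for the underlying Auto Scaling group.
  minSize: 1
  maxSize: 3
  awsLaunchTemplate:
    instanceType: "${AWS_NODE_MACHINE_TYPE}"
    sshKeyName: "${AWS_SSH_KEY_NAME}"
    # Assumed to be the node instance profile created by clusterawsadm.
    iamInstanceProfile: "nodes.cluster-api-provider-aws.sigs.k8s.io"
```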

richardcase avatar Oct 16 '20 17:10 richardcase

/assign @richardcase For triage.

sedefsavas avatar Jun 28 '21 17:06 sedefsavas

This is a nice to have.

/milestone v0.7.x /priority backlog /unassign

richardcase avatar Jul 01 '21 14:07 richardcase

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 29 '21 15:09 k8s-triage-robot

Hi @richardcase, I'm a new contributor and willing to work on this good first issue. However, from the comments it looks like this was picked up and then nothing happened. Just to confirm, is this still a valid, ongoing issue? If so, may I pick it up?

vibhorrawat avatar Sep 30 '21 09:09 vibhorrawat

Hey @vibhorrawat - thanks for considering contributing. This is still a valid issue to work on. If you want to work on it, feel free to assign it to yourself and mark it active (see this comment for an example of how to do that). If you have any questions along the way, feel free to ping me here or in the slack channel.

richardcase avatar Sep 30 '21 09:09 richardcase

@richardcase Great! Thank you! /assign /lifecycle active

vibhorrawat avatar Sep 30 '21 09:09 vibhorrawat

@richardcase Here is my understanding of the problem statement; please feel free to correct me if I'm wrong.

  • Create a new template (a yaml file) for eks-machinepool. When I looked at the templates folder, I saw a few templates like template-eks and template-machinepool. Do we now need a new one that is a combination of both?
  • Second, update the Makefile to copy the template to the out folder. Are you referring to this line, or is something else required here? Could you share some pointers?
  • Third, release docs are here. Kindly confirm.

Since I'm a newbie to k8s, I will try to set up the development environment locally by following the guide.

Please share your thoughts.

vibhorrawat avatar Oct 01 '21 17:10 vibhorrawat

Hi @vibhorrawat

Your understanding is correct :+1:

Updating the Makefile to copy the template is already done. As well as updating the release notes documentation, I'd also add a note to the machine pool docs to mention that there is an EKS with machine pool template as well.

richardcase avatar Oct 04 '21 09:10 richardcase

@richardcase A question: does the Set up AWS Environment step require a separate AWS account provided by CNCF, or can I configure my own? I did not understand the suggested steps in the documentation, apologies for that. Maybe I lack the context of how clusterawsadm deploys the infrastructure on AWS.

Besides, for the template, could you please help me with which kinds should go in? I'm referring to these two:

  • cluster-template-machinepool.yaml
  • cluster-template-eks.yaml

For instance, should the Cluster have infrastructureRef as AWSCluster and controlPlaneRef as KubeadmControlPlane instead of AWSManagedControlPlane? And should we use EKSConfigTemplate for bootstrap in the MachinePool?

Please suggest. We can connect over slack if you are around. Thank you.

vibhorrawat avatar Oct 08 '21 12:10 vibhorrawat

👋 In regards to having this flavour, we would be very interested in seeing how it should be done.

We recently tried to evaluate non-managed MachinePools for an AWSManagedControlPlane and found that EKSConfigTemplate does not seem to work with MachinePools: it does not seem to reconcile, and we get the message Bootstrap data secret reference is not yet available. Creating a raw EKSConfig didn't seem to work either, as the ownerReferences usually point towards a Machine in a MachineDeployment, so it is not reconciled.

It looks like the MachinePool controller doesn't support bootstrapRef, only the bootstrap name directly. https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/7c7ebfe7e2256a092caa8df1481a838feee35836/exp/controllers/awsmachinepool_controller.go#L218

Unless a webhook is supposed to mutate that field and the EKSConfig(Template) is just not being reconciled?

Jacobious52 avatar Oct 21 '21 06:10 Jacobious52

@Jacobious52 - EKS with AWSMachinePool should work, and it's something that we tested early on. But I have mostly used managed machine pools since, and looking at the e2e we don't specifically test for unmanaged machine pools (i.e. AWSMachinePool), which we should.

It looks like the MachinePool controller doesn't support bootstrapRef, only the bootstrap name directly.

This line is a check to see whether the EKS bootstrap controller has reconciled the EKS config yet. In the controller, if the machine pools feature flag is enabled, then we watch MachinePool:

https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/7c7ebfe7e2256a092caa8df1481a838feee35836/bootstrap/eks/controllers/eksconfig_controller.go#L221:L226

So the EKS config should be reconciled (and the secret created) for both AWSMachinePool and AWSManagedMachinePool.
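
As a rough sketch of how that feature flag might be switched on before the provider is initialized: clusterctl can read variables from its config file or from the environment. The file location and the variable names EXP_MACHINE_POOL and EXP_EKS below are assumptions based on releases from around this time; the names have changed between releases, so check the provider docs for your version.

```yaml
# Sketch of clusterctl configuration variables.
# Assumed file location: ~/.cluster-api/clusterctl.yaml
# Variable names are assumptions and may differ in your release.
EXP_MACHINE_POOL: "true"   # assumed to enable the MachinePool feature gate
EXP_EKS: "true"            # assumed to enable the EKS controllers
```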

I will have a look into this asap and also add it to the EKS e2e checks....unless you fancied taking a look?

richardcase avatar Oct 21 '21 09:10 richardcase

@richardcase - Thanks for pointing me in the right direction, I found https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/7c7ebfe7e2256a092caa8df1481a838feee35836/bootstrap/eks/controllers/eksconfig_controller.go#L309 and that the reason it wasn't reconciling for me was I referenced an EKSConfigTemplate like in the MachineDeployment flavour, instead of a concrete EKSConfig. When I tried creating an EKSConfig directly, I didn't update the MachinePool's reference to the new object. This makes more sense now given the eksconfig_controller is in charge of adding the MachinePool as the owner so it gets properly reconciled.

Correctly referencing the EKSConfig works for me now 👍 I guess adding tests will still be beneficial anyway.

Having this flavour provided in the templates should help others to avoid that confusion. Thanks again!


@vibhorrawat - For reference if it helps for your template my working bootstrap for MachinePool looks like below without the templating vars.

spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
      kind: EKSConfig
      name: mp-eks-config
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: EKSConfig
metadata:
  name: mp-eks-config

And for the Cluster refs, both reference the AWSManagedControlPlane

spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: mp-control-plane
  infrastructureRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: mp-control-plane
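
Put in template form, those Cluster refs and the control plane they point to could look roughly like the sketch below. The pod CIDR, the AWSManagedControlPlane spec fields, and the ${...} variables are assumptions modelled on the existing EKS template rather than something confirmed in this thread.

```yaml
# Sketch only: Cluster whose controlPlaneRef and infrastructureRef both point
# at the same AWSManagedControlPlane. Values are illustrative placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: "${CLUSTER_NAME}"
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: "${CLUSTER_NAME}-control-plane"
  infrastructureRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: "${CLUSTER_NAME}-control-plane"
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: AWSManagedControlPlane
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  region: "${AWS_REGION}"
  sshKeyName: "${AWS_SSH_KEY_NAME}"
  version: "${KUBERNETES_VERSION}"
```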

Jacobious52 avatar Oct 21 '21 12:10 Jacobious52

@Jacobious52 Thank you for the pointer, this will help for sure, because the template I have prepared locally (I was about to test it as soon as my development environment is ready) uses EKSConfigTemplate.

@richardcase are we good to use EKSConfig as the kind in MachinePool -> spec -> bootstrap? Kindly suggest.

vibhorrawat avatar Oct 21 '21 13:10 vibhorrawat

Yes, you should be able to, as per the example from @Jacobious52.

richardcase avatar Oct 21 '21 15:10 richardcase

Also created https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/2865 to update the eks e2e testing to cover AWSMachinePool

richardcase avatar Oct 21 '21 15:10 richardcase

@richardcase Hey there, here is the latest on this issue: I'm ready with the required template (with the aforementioned suggestions added); however, I'm struggling to test it locally.

Here is what I have done so far:

  • I have followed the developer guide to build the kind cluster locally using tilt. Thanks to @sedefsavas I was unblocked and was able to execute the tilt up command.
  • The next step was to validate that pods are up and running on the kind cluster. This is where the problem is, and I'm unable to move forward. When I execute kubectl get pods -A, there is no response or error 😞
  • After this, the important step is to spin up a CAPA managed workload cluster using the required (in my case, newly created) template.

Your thoughts, any pointers to proceed?

Thank you!

vibhorrawat avatar Nov 10 '21 17:11 vibhorrawat

After creating a cluster with kind create cluster before tilt up, you should be able to see the pods.

https://kind.sigs.k8s.io/docs/user/quick-start/#creating-a-cluster

You might have an old kubeconfig path set to the KUBECONFIG env variable so unsetting that will default to the kubeconfig created by KinD.

After making sure the Kubernetes cluster comes up, run tilt up and observe the cluster-api controller deployments at http://localhost:10350/

sedefsavas avatar Nov 10 '21 18:11 sedefsavas

Thank you @sedefsavas. I will give it another try.

vibhorrawat avatar Nov 11 '21 12:11 vibhorrawat

@vibhorrawat - feel free to ping me on slack as well if you need help.

richardcase avatar Nov 11 '21 13:11 richardcase

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Dec 11 '21 13:12 k8s-triage-robot

/lifecycle frozen

richardcase avatar Dec 11 '21 13:12 richardcase

Hey @vibhorrawat are you still working on this issue?

Ankitasw avatar Mar 08 '22 15:03 Ankitasw

Hi @Ankitasw, yes! I'm working on this ticket. The status is that I'm ready with the template; however, I'm running into an issue whilst testing it locally. I did get in touch with @richardcase and shared my observations about the issue. Honestly, this was before the Christmas holidays and I have been unable to get back to it. Apologies for the delay from my end. I will get in touch with Richard this week and get his thoughts on it. Besides, if you can help me with the troubleshooting, that would be great. I can share the observations here in the comments. Let me know your thoughts. Thank you!

vibhorrawat avatar Mar 08 '22 18:03 vibhorrawat

@vibhorrawat yes, please share your observations, we will try to help.

Ankitasw avatar Mar 09 '22 03:03 Ankitasw