CFP: Update Cilium Helm install docs for EKS and the AWS VPC CNI

Open caleb-devops opened this issue 1 year ago • 8 comments

Cilium Feature Proposal

Is your proposed feature related to a problem?

The documentation for installing Cilium on EKS with Helm currently recommends patching the VPC CNI DaemonSet (aws-node) with kubectl so that Cilium, rather than the VPC CNI, manages ENIs. While this works, it adds a manual step that prevents bootstrapping a Cilium EKS cluster using Terraform or eksctl.

# Relevant code
kubectl -n kube-system patch daemonset aws-node --type='strategic' -p='{"spec":{"template":{"spec":{"nodeSelector":{"io.cilium/aws-node-enabled":"true"}}}}}'

Describe the feature you'd like

Please update the docs to instead recommend using addon configuration values to patch the vpc-cni at the time it's deployed. Please note that nodeSelector is not a value that can be configured, so instead, affinity must be used.

The VPC CNI can be kept off Cilium-managed nodes with the following configuration values, which schedule aws-node only on nodes labeled io.cilium/aws-node-enabled=true:

{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"io.cilium/aws-node-enabled","operator":"In","values":["true"]}]}]}}}}
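For anyone generating these values from a script rather than pasting the JSON by hand, here is a minimal sketch that builds the same structure and serializes it compactly. The function name `vpc_cni_configuration_values` is hypothetical; only the label key and the affinity structure come from the values above.

```python
import json

# Label key taken from the configuration values above. aws-node is only
# scheduled on nodes that carry this label with the value "true".
AWS_NODE_LABEL = "io.cilium/aws-node-enabled"


def vpc_cni_configuration_values(label_key: str = AWS_NODE_LABEL) -> str:
    """Return the vpc-cni add-on configuration values as compact JSON.

    Hypothetical helper for illustration; it only assembles the affinity
    block shown in this issue and does not talk to AWS.
    """
    values = {
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": label_key,
                                    "operator": "In",
                                    "values": ["true"],
                                }
                            ]
                        }
                    ]
                }
            }
        }
    }
    # Compact separators keep the string free of whitespace, matching the
    # one-line JSON above.
    return json.dumps(values, separators=(",", ":"))


if __name__ == "__main__":
    print(vpc_cni_configuration_values())
```

The resulting string is what would be supplied as the add-on's configuration values, for example through the `configuration_values` argument in Terraform.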

caleb-devops avatar Feb 28 '24 20:02 caleb-devops

Sounds like this would be quite helpful. The next step would be creating a concrete PR proposal.

joestringer avatar Mar 04 '24 22:03 joestringer

Hi @caleb-devops, thanks for the tip, but when I apply this configuration before installing Cilium, the coredns addon doesn't start (obviously, because no CNI is found).

Smana avatar Mar 26 '24 14:03 Smana

Hi @Smana. CoreDNS requires that the CNI is deployed, so with the vpc-cni configuration values in place, Cilium will need to be installed before CoreDNS can run. The recommended node taint should prevent other pods (like coredns) from being scheduled on the node until Cilium is deployed.

  taints:
   - key: "node.cilium.io/agent-not-ready"
     value: "true"
     effect: "NoExecute"
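To illustrate why this taint holds CoreDNS back while still letting the Cilium agent start, here is a rough sketch of how Kubernetes matches tolerations against taints. It is a simplified model of the documented matching rules, not the scheduler's actual code; the `Taint`, `Toleration`, `tolerates` and `can_run_on` names are made up for this example, and the blanket `Exists` toleration for the Cilium agent is an assumption about its pod spec.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches every taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    if toleration.operator == "Equal":
        return toleration.value == taint.value
    return False


def can_run_on(tolerations: Iterable[Toleration], taints: Iterable[Taint]) -> bool:
    """A pod may run on a node only if every blocking taint is tolerated."""
    tolerations = list(tolerations)
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


if __name__ == "__main__":
    not_ready = Taint("node.cilium.io/agent-not-ready", "true", "NoExecute")
    coredns = []                                 # no matching toleration
    cilium_agent = [Toleration(operator="Exists")]  # assumed blanket toleration
    print("coredns admitted:", can_run_on(coredns, [not_ready]))
    print("cilium admitted:", can_run_on(cilium_agent, [not_ready]))
```

In this model a pod without a matching toleration stays off the node for as long as the taint is present, and becomes eligible once the taint is gone.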

caleb-devops avatar Mar 27 '24 05:03 caleb-devops

Thanks @caleb-devops. Actually, I already have a toleration. However, the Cilium install only starts after the EKS module deployment has finished (including CoreDNS, which is an EKS addon).

Smana avatar Mar 27 '24 08:03 Smana

@Smana you don't need to add the toleration to CoreDNS. Because CoreDNS relies on the CNI, it will need to be deployed after Cilium is installed. For the terraform-aws-modules/eks/aws module, try the following:

  1. Set vpc-cni configuration_values in the terraform-aws-modules/eks/aws module

      cluster_addons = {
        vpc-cni = {
          most_recent    = true
          before_compute = true
    
          configuration_values = jsonencode({
            affinity = {
              nodeAffinity = {
                requiredDuringSchedulingIgnoredDuringExecution = {
                  nodeSelectorTerms = [{
                    matchExpressions = [{
                      key      = "io.cilium/aws-node-enabled"
                      operator = "In"
                      values   = ["true"]
                    }]
                  }]
                }
              }
            }
          })
        }
      }
    
  2. Install the Cilium Helm chart using the Terraform Helm provider

  3. Install remaining addons (I use the terraform-aws-eks-blueprints-addons module for this)
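The ordering in the three steps above can be written down as a small dependency graph, which makes it easier to see why CoreDNS has to come last. This is only a sketch: the component names are labels invented for the example, and placing compute between the vpc-cni addon and Cilium is my reading of `before_compute = true`, not something the modules guarantee.

```python
from graphlib import TopologicalSorter

# Each key depends on the components in its set. The edges are an
# assumption drawn from the steps in this comment.
DEPENDS_ON = {
    "compute": {"vpc-cni addon with affinity"},
    "cilium helm release": {"compute"},
    "remaining addons incl. coredns": {"cilium helm release"},
}


def apply_order(depends_on: dict) -> list:
    """Return one valid order in which the components can be applied."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, component in enumerate(apply_order(DEPENDS_ON), start=1):
        print(step, component)
```

In Terraform the same ordering would normally be expressed through resource references or `depends_on` rather than computed by hand.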

caleb-devops avatar Mar 28 '24 14:03 caleb-devops

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar May 28 '24 01:05 github-actions[bot]

The AWS EKS team will be adding an option to initialize a bare EKS cluster (without any addons) through https://github.com/aws/containers-roadmap/issues/923. After they do, it should no longer be necessary to patch the VPC CNI to disable it.

caleb-devops avatar Jun 06 '24 18:06 caleb-devops

EKS clusters can now be created without any addons: https://aws.amazon.com/about-aws/whats-new/2024/06/amazon-eks-cluster-creation-flexibility-networking-add-ons/

caleb-devops avatar Jun 27 '24 22:06 caleb-devops

@caleb-devops may I know your eventual script to setup eks together with cilium in one go?

truongnht avatar Aug 01 '24 11:08 truongnht

> @caleb-devops may I know your eventual script to setup eks together with cilium in one go?

@caleb-devops I am very interested in this too. I'm deploying a bare EKS cluster and there's some very strange ordering of events going on, with CoreDNS refusing to become healthy (and thus the nodes stall in the NotReady state).

jgalliers avatar Sep 17 '24 06:09 jgalliers

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar Dec 08 '24 02:12 github-actions[bot]

This issue has not seen any activity since it was marked stale. Closing.

github-actions[bot] avatar Dec 23 '24 02:12 github-actions[bot]