
Add flavour for using AWS VPC CNI

Open Sn0rt opened this issue 6 years ago • 22 comments

/kind feature

Given my background, I would like to use native AWS networking with Kubernetes, as EKS does. However, cluster-api-provider-aws does not support it yet. Can we consider supporting amazon-vpc-cni-k8s, or would you accept a PR that implements this feature?

Sn0rt avatar Jul 26 '19 09:07 Sn0rt

@Sn0rt I don't see any issues with adding support for amazon-vpc-cni-k8s.

At a cursory glance I believe it would require:

  • Swapping out the Calico manifests in addons.yaml with the manifests for amazon-vpc-cni-k8s
  • Updating the nodes.cluster-api-provider-aws.sigs.k8s.io policy as per the docs
  • The user to override the kubelet --max-pods appropriately for each Machine* object they define to avoid overscheduling any individual Node.
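To make the third point concrete, here is a minimal sketch of how a --max-pods value is commonly derived for the VPC CNI: the number of ENIs times the usable (non-primary) addresses per ENI, plus two for host-network pods. The ENI limits in the table are assumptions taken from the EC2 documentation for two example instance types, and the function name is hypothetical; verify the limits for the instance types you actually use.

```python
# Sketch: upper bound on schedulable pods per node under the AWS VPC CNI.
# The ENI limits below are assumptions copied from the EC2 documentation;
# check them against the instance types you actually run.
ENI_LIMITS = {
    # instance type: (max ENIs, IPv4 addresses per ENI)
    "t2.medium": (3, 6),
    "m5.large": (3, 10),
}


def max_pods(instance_type: str) -> int:
    """Return enis * (ips_per_eni - 1) + 2 for the given instance type."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    # One address on each ENI is its primary IP and is not handed to pods;
    # the +2 accounts for pods that use host networking.
    return enis * (ips_per_eni - 1) + 2


if __name__ == "__main__":
    for itype in ENI_LIMITS:
        print(f"{itype}: --max-pods={max_pods(itype)}")
```

Under these assumed limits a t2.medium comes out at 17 pods, which is the value a user would pass to the kubelet for Machines of that type.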

/cc @randomvariable cc'ing Naadir in case he has thoughts on how we can potentially scope down the IAM permissions needed vs the broad permissions listed in the docs.

detiber avatar Jul 26 '19 14:07 detiber

:+1:

We're big VPC CNI users and would be happy to help out on this.

/cc @sethp-nr

rudoi avatar Jul 26 '19 17:07 rudoi

Can do either of the following:

  • Create a new policy and attach it to the node manually
  • Add an option to the CloudFormation generation to do it automatically. Might be better not to modify the existing policies.

randomvariable avatar Jul 29 '19 14:07 randomvariable

FWIW this works today by applying a custom policy to the control plane machines and worker nodes with the Machine's spec.providerSpec.value.iamInstanceProfile. It doesn't look like any of the ENI stuff is scoped to the CAPA tag(s), despite some evidence that we wanted to – @rudoi do you remember if we tried to get the CNI permissions to be scoped to just the CAPA machines?

sethp-nr avatar Jul 29 '19 17:07 sethp-nr

I finished a POC.

1: create a cluster

Create the cluster and the control plane.

apiVersion: "cluster.k8s.io/v1alpha1"
kind: Cluster
metadata:
  name: aws-eni
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  providerSpec:
    value:
      apiVersion: "awsprovider/v1alpha1"
      kind: "AWSClusterProviderSpec"
      region: "us-east-2"
      sshKeyName: "guohao"

2: create machine deployment

apiVersion: "cluster.k8s.io/v1alpha1"
kind: MachineDeployment
metadata:
  name: aws-eni-machinedeployment
  labels:
    cluster.k8s.io/cluster-name: aws-eni
spec:
  replicas: 1
  selector:
    matchLabels:
      cluster.k8s.io/cluster-name: aws-eni
      set: node
  template:
    metadata:
      labels:
        cluster.k8s.io/cluster-name: aws-eni
        set: node
    spec:
      versions:
        kubelet: v1.14.4
      providerSpec:
        value:
          apiVersion: awsprovider/v1alpha1
          kind: AWSMachineProviderSpec
          instanceType: "t2.medium"
          iamInstanceProfile: "nodes.cluster-api-provider-aws.sigs.k8s.io"
          keyName: "guohao"

3: create the IAM permission

Create a policy that assigns the required permissions to the node.

guohao@buffer ~ $ aws iam get-policy --policy-arn arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM
{
    "Policy": {
        "PolicyName": "amazon-vpc-cni-k8s-IAM",
        "PolicyId": "ANPASTTAGUKRHOLMEGMU2",
        "Arn": "arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 1,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "Description": "the permission of aws eni",
        "CreateDate": "2019-08-09T02:35:54Z",
        "UpdateDate": "2019-08-09T02:35:54Z"
    }
}
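The get-policy output above shows only the policy metadata, not the policy document. As a rough sketch of what such a document might contain, the snippet below builds one in Python. The action list is an assumption modelled on the AWS-managed AmazonEKS_CNI_Policy and is not taken from this POC; compare it with the amazon-vpc-cni-k8s documentation for the version you deploy. The function name is hypothetical.

```python
import json

# Assumption: actions modelled on the AWS-managed AmazonEKS_CNI_Policy.
# This is not the exact document used in the POC above.
CNI_ACTIONS = [
    "ec2:AssignPrivateIpAddresses",
    "ec2:AttachNetworkInterface",
    "ec2:CreateNetworkInterface",
    "ec2:DeleteNetworkInterface",
    "ec2:DescribeInstances",
    "ec2:DescribeInstanceTypes",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DescribeTags",
    "ec2:DetachNetworkInterface",
    "ec2:ModifyNetworkInterfaceAttribute",
    "ec2:UnassignPrivateIpAddresses",
]


def build_cni_policy() -> dict:
    """Return an IAM policy document granting the ENI permissions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": CNI_ACTIONS, "Resource": "*"},
            {
                # Tagging is limited to network interfaces only.
                "Effect": "Allow",
                "Action": ["ec2:CreateTags"],
                "Resource": ["arn:aws:ec2:*:*:network-interface/*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_cni_policy(), indent=2))
```

The printed JSON can be saved to a file and passed to `aws iam create-policy` with `--policy-document file://<path>`. Note that the first statement uses `"Resource": "*"`, so it is not scoped to CAPA-tagged resources.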

4: attach the AWS-ENI-CNI permission policy to the nodes.cluster-api-provider-aws.sigs.k8s.io role, which is assigned to the worker nodes

guohao@buffer ~ $ aws iam list-attached-role-policies --role-name nodes.cluster-api-provider-aws.sigs.k8s.io

The output is as follows:

{
    "AttachedPolicies": [
        {
            "PolicyName": "amazon-vpc-cni-k8s-IAM",
            "PolicyArn": "arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM"
        },
        {
            "PolicyName": "nodes.cluster-api-provider-aws.sigs.k8s.io",
            "PolicyArn": "arn:aws:iam::179516646050:policy/nodes.cluster-api-provider-aws.sigs.k8s.io"
        }
    ]
}

5: check the node of the cluster

Get the kubeconfig with clusterctl.

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig get node
NAME                                       STATUS     ROLES    AGE   VERSION
ip-10-0-0-133.us-east-2.compute.internal   NotReady   master   18h   v1.14.4
ip-10-0-0-172.us-east-2.compute.internal   NotReady   node     17h   v1.14.4

6: apply the aws-eni-ds

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/v1.5/aws-k8s-cni.yaml
clusterrole.rbac.authorization.k8s.io/aws-node created
serviceaccount/aws-node created
clusterrolebinding.rbac.authorization.k8s.io/aws-node created
daemonset.apps/aws-node created
customresourcedefinition.apiextensions.k8s.io/eniconfigs.crd.k8s.amazonaws.com created

7: check the pod status

You will find that the pods stay in the ContainerCreating status for too long.

kube-system   coredns-584795fc57-lnn5h                                           0/1     ContainerCreating   0          20h    <none>       ip-10-0-0-133.us-east-2.compute.internal   <none>           <none>
kube-system   coredns-584795fc57-nmcsj                                           0/1     ContainerCreating   0          20h    <none>       ip-10-0-0-133.us-east-2.compute.internal   <none>           <none>

Delete them, and Kubernetes will recreate the pods.

kube-system   coredns-584795fc57-ztmbx                                           1/1     Running             0          22m    10.0.0.237   ip-10-0-0-172.us-east-2.compute.internal   <none>           <none>

8: the IP pool assigned to the EC2 instance

Get the status of the instance:

guohao@buffer ~ $ aws ec2 describe-instances  --instance-id i-053b8794d7f90a110

{
    "Reservations": [
        {
....
                    "NetworkInterfaces": [
                        {
                            "Attachment": {
                                "AttachTime": "2019-08-08T09:06:33.000Z",
                                "AttachmentId": "eni-attach-09285aff116268f94",
                                "DeleteOnTermination": true,
                                "DeviceIndex": 0,
                                "Status": "attached"
                            },
                            "Description": "",
                            "Groups": [
                                {
                                    "GroupName": "aws-eni-lb",
                                    "GroupId": "sg-021eaefb3018d0551"
                                },
                                {
                                    "GroupName": "aws-eni-node",
                                    "GroupId": "sg-04cfe4c2052f87031"
                                }
                            ],
                            "Ipv6Addresses": [],
                            "MacAddress": "02:8e:50:b0:02:8a",
                            "NetworkInterfaceId": "eni-03b66efcc616b8c86",
                            "OwnerId": "179516646050",
                            "PrivateDnsName": "ip-10-0-0-172.us-east-2.compute.internal",
                            "PrivateIpAddress": "10.0.0.172",
                            "PrivateIpAddresses": [
                                {
                                    "Primary": true,
                                    "PrivateDnsName": "ip-10-0-0-172.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.172"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-232.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.232"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-170.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.170"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-237.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.237"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-205.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.205"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-222.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.222"
                                }
                            ],
                            "SourceDestCheck": true,
                            "Status": "in-use",
                            "SubnetId": "subnet-0892c669597c0a9aa",
                            "VpcId": "vpc-0eadd8ecf99f5b4c6",
                            "InterfaceType": "interface"
                        },
                        {
                            "Attachment": {
                                "AttachTime": "2019-08-09T05:17:24.000Z",
                                "AttachmentId": "eni-attach-0591fb5b94cb67eb8",
                                "DeleteOnTermination": true,
                                "DeviceIndex": 1,
                                "Status": "attached"
                            },
                            "Description": "aws-K8S-i-053b8794d7f90a110",
                            "Groups": [
                                {
                                    "GroupName": "aws-eni-lb",
                                    "GroupId": "sg-021eaefb3018d0551"
                                },
                                {
                                    "GroupName": "aws-eni-node",
                                    "GroupId": "sg-04cfe4c2052f87031"
                                }
                            ],
                            "Ipv6Addresses": [],
                            "MacAddress": "02:58:f9:8c:b5:3c",
                            "NetworkInterfaceId": "eni-0da443a1cf644f334",
                            "OwnerId": "179516646050",
                            "PrivateDnsName": "ip-10-0-0-56.us-east-2.compute.internal",
                            "PrivateIpAddress": "10.0.0.56",
                            "PrivateIpAddresses": [
                                {
                                    "Primary": true,
                                    "PrivateDnsName": "ip-10-0-0-56.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.56"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-183.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.183"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-74.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.74"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-91.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.91"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-235.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.235"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-236.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.236"
                                }
                            ],
                            "SourceDestCheck": true,
                            "Status": "in-use",
                            "SubnetId": "subnet-0892c669597c0a9aa",
                            "VpcId": "vpc-0eadd8ecf99f5b4c6",
                            "InterfaceType": "interface"
                        }
                    ],
                    "RootDeviceName": "/dev/sda1",
                    "RootDeviceType": "ebs",
                    "SecurityGroups": [
                        {
                            "GroupName": "aws-eni-lb",
                            "GroupId": "sg-021eaefb3018d0551"
                        },
                        {
                            "GroupName": "aws-eni-node",
                            "GroupId": "sg-04cfe4c2052f87031"
                        }
                    ],
                    "SourceDestCheck": true,
                    "Tags": [
                        {
                            "Key": "sigs.k8s.io/cluster-api-provider-aws/role",
                            "Value": "node"
                        },
                        {
                            "Key": "sigs.k8s.io/cluster-api-provider-aws/cluster/aws-eni",
                            "Value": "owned"
                        },
                        {
                            "Key": "Name",
                            "Value": "aws-eni-machinedeployment-5745b4948d-tg55f"
                        },
                        {
                            "Key": "kubernetes.io/cluster/aws-eni",
                            "Value": "owned"
                        }
                    ],
 ...
    ]
}
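The output above is long, so a small helper can make the IP pool easier to read. The sketch below lists, for each ENI, the secondary private addresses available to the CNI. It assumes the standard describe-instances JSON layout (Reservations, then Instances, then NetworkInterfaces); the embedded sample is a trimmed illustration built from a few addresses shown above, and the function name is hypothetical.

```python
import json
from typing import Dict, List

# Trimmed, illustrative sample in the shape of `aws ec2 describe-instances` output.
SAMPLE = json.loads("""
{
  "Reservations": [{
    "Instances": [{
      "NetworkInterfaces": [{
        "NetworkInterfaceId": "eni-03b66efcc616b8c86",
        "Attachment": {"DeviceIndex": 0},
        "PrivateIpAddresses": [
          {"Primary": true,  "PrivateIpAddress": "10.0.0.172"},
          {"Primary": false, "PrivateIpAddress": "10.0.0.232"},
          {"Primary": false, "PrivateIpAddress": "10.0.0.237"}
        ]
      }]
    }]
  }]
}
""")


def summarize_enis(output: Dict) -> List[Dict]:
    """Collect the secondary (pod-assignable) IPs of every attached ENI."""
    rows = []
    for reservation in output.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            for eni in instance.get("NetworkInterfaces", []):
                secondary = [
                    addr["PrivateIpAddress"]
                    for addr in eni.get("PrivateIpAddresses", [])
                    if not addr.get("Primary")
                ]
                rows.append({
                    "eni": eni["NetworkInterfaceId"],
                    "device_index": eni["Attachment"]["DeviceIndex"],
                    "secondary_ips": secondary,
                })
    return rows


if __name__ == "__main__":
    for row in summarize_enis(SAMPLE):
        print(row["eni"], row["device_index"], len(row["secondary_ips"]), "secondary IPs")
```

Run against the full output above, this would report two ENIs with five secondary addresses each, one of which (10.0.0.237) is the address the rescheduled coredns pod received.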

Then check the status of the ENI DaemonSet:

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig logs aws-node-lc7ph -n kube-system
====== Starting amazon-k8s-agent ======
Checking if ipamd is serving
Waiting for ipamd health check
Ipamd is up and serving
Copying AWS CNI plugin and config
Node ready, watching ipamd health

Sn0rt avatar Aug 09 '19 05:08 Sn0rt

FWIW this works today by applying a custom policy to the control plane machines and worker nodes with the Machine's spec.providerSpec.value.iamInstanceProfile. It doesn't look like any of the ENI stuff is scoped to the CAPA tag(s), despite some evidence that we wanted to – @rudoi do you remember if we tried to get the CNI permissions to be scoped to just the CAPA machines?

Hi, are you still working on this feature?

Sn0rt avatar Aug 16 '19 05:08 Sn0rt

/assign

Sn0rt avatar Aug 16 '19 05:08 Sn0rt

Folks, just as a reminder, use /lifecycle active if you're actively working on something 😃

vincepri avatar Aug 16 '19 17:08 vincepri

@Sn0rt It's working for us as-is, so we haven't touched it in quite a while. Feel free to pick this ticket up!

sethp-nr avatar Aug 16 '19 21:08 sethp-nr

/lifecycle active

Sn0rt avatar Aug 18 '19 10:08 Sn0rt

@sethp-nr

We should consider a cluster-level flag to indicate the current cluster's CNI solution.

The max-pods parameter for amazon-vpc-cni-k8s depends on the instance type; the values can be found here.

For example, I am considering a cluster-level annotation like the following.

Or should it be a label?

In my experience, annotations are used for configuration and labels are used to select objects.

apiVersion: "cluster.k8s.io/v1alpha1"
kind: Cluster
metadata:
  name: test1
  annotations:
    cluster.k8s.io/network-cni: amazon-vpc-cni-k8s  # supported values: amazon-vpc-cni-k8s, calico
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  providerSpec:
    value:
      apiVersion: "awsprovider/v1alpha1"
      kind: "AWSClusterProviderSpec"
      region: "us-east-2"
      sshKeyName: "guohao"

Then the CAPA controller can set the kubelet parameter based on this cluster-level annotation.
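A minimal sketch of that decision logic, written in Python for brevity rather than as actual CAPA controller code. The annotation key is the one proposed in this comment and does not exist in the API today. The function name, the fallback to calico when the annotation is absent, and the idea of passing the per-instance-type pod limit in as a parameter are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical annotation key, as proposed in this comment.
CNI_ANNOTATION = "cluster.k8s.io/network-cni"
SUPPORTED_CNIS = ("amazon-vpc-cni-k8s", "calico")


def kubelet_extra_args(annotations: Dict[str, str], vpc_cni_max_pods: int) -> Dict[str, str]:
    """Return extra kubelet flags implied by the cluster-level CNI annotation.

    vpc_cni_max_pods is the pod limit for the Machine's instance type,
    looked up elsewhere; it is only applied when the VPC CNI is selected.
    """
    # Assumption: clusters without the annotation keep the Calico default.
    cni = annotations.get(CNI_ANNOTATION, "calico")
    if cni not in SUPPORTED_CNIS:
        raise ValueError(f"unsupported CNI {cni!r}; expected one of {SUPPORTED_CNIS}")
    if cni == "amazon-vpc-cni-k8s":
        # Pod IPs come from the instance's ENIs, so cap scheduling accordingly.
        return {"max-pods": str(vpc_cni_max_pods)}
    # Calico uses an overlay pod CIDR, so no instance-specific cap is added.
    return {}


if __name__ == "__main__":
    annotations = {CNI_ANNOTATION: "amazon-vpc-cni-k8s"}
    print(kubelet_extra_args(annotations, vpc_cni_max_pods=17))
```

Keeping the instance-type lookup outside this function means the annotation only selects the behaviour, while the limit itself can still be overridden per Machine.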

what do you think?

Sn0rt avatar Aug 19 '19 08:08 Sn0rt

/milestone v0.5.0

ncdc avatar Oct 10 '19 15:10 ncdc

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Mar 05 '20 18:03 fejta-bot

/lifecycle frozen

detiber avatar Mar 05 '20 19:03 detiber

Given https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/1747 allows customisation of CNI rules, and clusterawsadm now allows customisation of policies, it should be easier to add a template flavour that uses the AWS VPC CNI.

/help

randomvariable avatar Aug 14 '20 13:08 randomvariable

@randomvariable: This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

Given https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/1747 allows customisation of CNI rules, and clusterawsadm now allows customisation of policies, it should be easier to add a template flavour that uses the AWS VPC CNI.

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 14 '20 13:08 k8s-ci-robot

I think we can close this now, as the VPC CNI is automatically installed in EKS and is available as an EKS addon to set the specific version.

richardcase avatar Jun 28 '21 14:06 richardcase

This issue is not only related to EKS side, so reopening it to track adding a template for AWS native CNI with unmanaged clusters.

sedefsavas avatar Mar 07 '22 23:03 sedefsavas

/remove-lifecycle frozen

richardcase avatar Jul 08 '22 21:07 richardcase

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 06 '22 22:10 k8s-triage-robot

/remove-lifecycle stale

richardcase avatar Oct 10 '22 10:10 richardcase

/triage accepted

Skarlso avatar Oct 31 '22 16:10 Skarlso

This is ultimately a documentation task: it should define how to install Calico using ClusterResourceSet or AddonProviders (which will eventually deprecate ClusterResourceSet).

Skarlso avatar Oct 31 '22 16:10 Skarlso

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 08 '23 11:02 k8s-triage-robot