
AWS IAM Policy and documentation update / enhancement

Open cmcconnell1 opened this issue 7 years ago • 13 comments

Hello,

Thanks for putting this out, looking forward to testing this.

I was reading the kube-ingress-aws-controller prerequisites, specifically the section that begins "The full list of required AWS IAM roles is the following", which includes this code:

{
    "Action": "elasticloadbalancingv2:DescribeTargetGroups",
    "Resource": "*",
    "Effect": "Allow"
},

According to the doc, these statements seem to be intended for an IAM policy (used with a role), but elasticloadbalancingv2 is not a valid action prefix there. Ref: IAM Policy Actions Grouped by Access Level | Service Actions Included in the Read Access Level

elasticloadbalancingv2 is valid in CloudFormation, though.
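
One way to catch this kind of problem before handing a policy to IAM is to scan the service prefix of every action. The sketch below is only illustrative: KNOWN_PREFIXES is a hand-written assumption limited to the services that come up in this thread (not an authoritative list from AWS), and unknown_action_prefixes is a hypothetical helper name.

```python
import json

# Assumption: a hand-maintained set limited to the service prefixes that
# appear in this thread. It is NOT an authoritative list from AWS.
KNOWN_PREFIXES = {
    "acm", "autoscaling", "cloudformation", "ec2",
    "elasticloadbalancing", "iam", "sts",
}


def unknown_action_prefixes(policy_json):
    """Return the sorted action prefixes in a policy that are not in KNOWN_PREFIXES."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # A policy may carry a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    unknown = set()
    for statement in statements:
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if action == "*":
                continue  # bare wildcard has no service prefix to check
            prefix = action.split(":", 1)[0]
            if prefix not in KNOWN_PREFIXES:
                unknown.add(prefix)
    return sorted(unknown)


SNIPPET = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "elasticloadbalancingv2:DescribeTargetGroups",
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
"""

print(unknown_action_prefixes(SNIPPET))
```

A non-empty result only means the prefix is missing from the hand-written set, so it is a prompt to check the AWS documentation rather than a verdict.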

What would be very helpful for us (and perhaps others) would be providing a valid AWS IAM policy file that we could add to our kubernetes deployment code/processes/addons, and simply attach to our existing kubernetes (worker) roles so that our existing clusters' worker nodes could use kube-ingress-aws-controller.

This would allow something like this on the fly (assuming we also use aws cloudformation to create the requisite load-balancer-elb-traffic too):

aws iam create-policy --policy-name zalando-incubator-kube-ingress-aws --policy-document file://alb-ingress-workernodes-policy.json

And then attach the newly created policy to our requisite kube cluster worker roles. . .

For the proposed AWS IAM policy file, and for what I think the referenced requirements document could include, something like the following would be great (I just threw this together as an example and am not sure it is correct):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "kubeIngressAwsControllerWorkerNodesPolicy",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "iam:ListServerCertificates",
                "cloudformation:Delete*",
                "autoscaling:DescribeAutoScalingGroups",
                "acm:ListCertificates",
                "cloudformation:List*",
                "ec2:DescribeRouteTables",
                "autoscaling:AttachLoadBalancers",
                "iam:GetServerCertificate",
                "autoscaling:DetachLoadBalancers",
                "elasticloadbalancing:*",
                "cloudformation:Create*",
                "autoscaling:DetachLoadBalancerTargetGroups",
                "ec2:DescribeSecurityGroups",
                "cloudformation:Describe*",
                "autoscaling:AttachLoadBalancerTargetGroups",
                "acm:DescribeCertificate",
                "ec2:DescribeVpcs",
                "ec2:DescribeSubnets",
                "cloudformation:Get*"
            ],
            "Resource": "*"
        }
    ]
}
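
A file like the one above could also be generated from a plain list of actions instead of being edited by hand. This is a minimal sketch, assuming a single Allow statement on all resources as in the example; build_policy is a hypothetical helper, and the short action list in the demo is just a subset of the actions above.

```python
import json


def build_policy(actions, sid="kubeIngressAwsControllerWorkerNodesPolicy"):
    """Build a single-statement IAM policy document allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                # Sorting and de-duplicating keeps the generated file stable in diffs.
                "Action": sorted(set(actions)),
                "Resource": "*",
            }
        ],
    }


# Demo with a subset of the actions from the example policy.
demo_actions = [
    "ec2:DescribeInstances",
    "acm:ListCertificates",
    "elasticloadbalancing:*",
    "ec2:DescribeInstances",  # duplicate on purpose, removed by build_policy
]
print(json.dumps(build_policy(demo_actions), indent=4))
```

The printed JSON can be redirected to a file and passed to aws iam create-policy through a file:// reference.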

Hopefully this is within scope as this would really help us to evaluate (and hopefully implement) this project.

Thank you, and thanks to @sszuecs for being available and responsive in Slack, etc.

cmcconnell1 avatar Dec 06 '17 19:12 cmcconnell1

Some additional snags I hit; if these look familiar, please let me know.

Could someone tell me whether we need additional tags or a specific name for the CloudFormation-created controller SG that the docs list as a prerequisite (link noted at the top of the initial post)?

I think I've created the requisite SG, but the pod cannot find it:

alb_ingress_pod=$(kk get po | grep ingress | awk '{print $1}') && kk logs $alb_ingress_pod
2017/12/06 21:32:52 starting /bin/kube-ingress-aws-controller
2017/12/06 21:32:52 required security group was not found

Excerpt from awless showing the security group created with CloudFormation from the prerequisites doc page:

awless ls securitygroups
|    ID ▲     |     VPC      |               INBOUND               |              OUTBOUND               |                NAME                 |             DESCRIPTION             |
|-------------|--------------|-------------------------------------|-------------------------------------|-------------------------------------|-------------------------------------|
| sg-ff123456 | vpc-abc12345 | [0.0.0.0/0](tcp:443)                | [0.0.0.0/0](any)                    | zalando-incubator-kube-ingress-aws- | zalando-incubator-kube-ingress-aws- |
|             |              | [0.0.0.0/0](tcp:80)                 |                                     | controller-lb-sg-                   | controller-lb-sg                    |
|             |              |                                     |                                     | IngressLoadBalancerSecurityGroup-   |                                     |
|             |              |                                     |                                     | 4A8ZY7E8R40I                        |                                     |

I've created what I think is the requisite IAM policy and attached that policy to the kube cluster worker role.

So I know there are issues so far. Additionally, in case this looks familiar, I am hitting an issue when trying to test skipper (I will check whether there is a minimum Kubernetes version; I am testing on 1.7.4):

kubectl apply -f deploy/skipper.yaml
error: error validating "deploy/skipper.yaml": error validating data: [ValidationError(DaemonSet.spec.template.metadata): unknown field "containers" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(DaemonSet.spec.template.metadata): unknown field "hostNetwork" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta]; if you choose to ignore these errors, turn validation off with --validate=false
kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T19:11:22Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.4+coreos.0", GitCommit:"4bb697e04f7c356347aee6ffaa91640b428976d5", GitTreeState:"clean", BuildDate:"2017-08-22T08:43:47Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Thanks

cmcconnell1 avatar Dec 06 '17 21:12 cmcconnell1

@cmcconnell1 this sounds totally valid. I tried to extract the relevant parts from our current config; I hope this helps:

    "IngressControllerIAMRole": {
        "Properties": {
            "AssumeRolePolicyDocument": {
                "Statement": [
                    {
                        "Action": [
                            "sts:AssumeRole"
                        ],
                        "Effect": "Allow",
                        "Principal": {
                            "Service": [
                                "ec2.amazonaws.com"
                            ]
                        }
                    },
                    {
                        "Action": [
                            "sts:AssumeRole"
                        ],
                        "Effect": "Allow",
                        "Principal": {
                            "AWS": {
                                "Fn::Join": [
                                    "",
                                    [
                                        "arn:aws:iam::170858875137:role/",
                                        {
                                            "Ref": "WorkerIAMRole"
                                        }
                                    ]
                                ]
                            }
                        }
                    }
                ],
                "Version": "2012-10-17"
            },
            "Path": "/",
            "Policies": [
                {
                    "PolicyDocument": {
                        "Statement": [
                            {
                                "Action": "acm:ListCertificates",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "acm:DescribeCertificate",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "autoscaling:DescribeAutoScalingGroups",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "autoscaling:AttachLoadBalancers",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "autoscaling:DetachLoadBalancers",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "autoscaling:DetachLoadBalancerTargetGroups",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "autoscaling:AttachLoadBalancerTargetGroups",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "cloudformation:*",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "elasticloadbalancing:*",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "elasticloadbalancingv2:*",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "ec2:DescribeInstances",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "ec2:DescribeSubnets",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "ec2:DescribeSecurityGroups",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "ec2:DescribeRouteTables",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "ec2:DescribeVpcs",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "iam:GetServerCertificate",
                                "Effect": "Allow",
                                "Resource": "*"
                            },
                            {
                                "Action": "iam:ListServerCertificates",
                                "Effect": "Allow",
                                "Resource": "*"
                            }
                        ],
                        "Version": "2012-10-17"
                    },
                    "PolicyName": "root"
                }
            ],
            "RoleName": "foo-1-app-ingr-ctrl"
        },
        "Type": "AWS::IAM::Role"
    },
    "IngressLoadBalancerSecurityGroup": {
        "Properties": {
            "GroupDescription": {
                "Ref": "AWS::StackName"
            },
            "SecurityGroupIngress": [
                {
                    "CidrIp": "0.0.0.0/0",
                    "FromPort": 80,
                    "IpProtocol": "tcp",
                    "ToPort": 80
                },
                {
                    "CidrIp": "0.0.0.0/0",
                    "FromPort": 443,
                    "IpProtocol": "tcp",
                    "ToPort": 443
                }
            ],
            "Tags": [
                {
                    "Key": "kubernetes.io/cluster/foo2",
                    "Value": "owned"
                },
                {
                    "Key": "kubernetes:application",
                    "Value": "kube-ingress-aws-controller"
                }
            ],
            "VpcId": "vpc-eecc9787"
        },
        "Type": "AWS::EC2::SecurityGroup"
    },

szuecs avatar Dec 06 '17 22:12 szuecs

You have to add the following tags to your AWS load balancer security group before updating:

  • kubernetes:application=kube-ingress-aws-controller
  • kubernetes.io/cluster/<cluster-id>=owned

The same tags are shown in the previous comment.
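
To make the tag requirement concrete, here is a small sketch that checks a security group's tags against the two tags listed above. It only restates the rule from this comment and is not the controller's actual lookup code; security_group_matches is a hypothetical helper name.

```python
def security_group_matches(sg_tags, cluster_id):
    """Return True if the tags contain both tags required for the given cluster ID."""
    required = {
        "kubernetes:application": "kube-ingress-aws-controller",
        "kubernetes.io/cluster/" + cluster_id: "owned",
    }
    # Extra tags (for example the aws:cloudformation:* ones) are ignored.
    return all(sg_tags.get(key) == value for key, value in required.items())


example_tags = {
    "kubernetes:application": "kube-ingress-aws-controller",
    "kubernetes.io/cluster/foo2": "owned",
}
print(security_group_matches(example_tags, "foo2"))
print(security_group_matches(example_tags, "another-cluster"))
```

Note that the result depends on the cluster ID as much as on the security group tags, so a correctly tagged group still fails the check if the cluster ID is wrong or unknown.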

szuecs avatar Dec 06 '17 22:12 szuecs

Hi @szuecs, the SG that I created does have those tags as well as others. Here's the complete list of keys and values for the CF-created SG:

Key                             Value
Name
aws:cloudformation:logical-id   IngressLoadBalancerSecurityGroup
aws:cloudformation:stack-id     arn:aws:cloudformation:us-west-1:123456789:stack/zalando-incubator-kube-ingress-aws-controller-lb-sg/884200e0-dacc-11e7-b8bf-50fae8e73cad
aws:cloudformation:stack-name   zalando-incubator-kube-ingress-aws-controller-lb-sg
kubernetes.io/cluster/opsdev    owned
kubernetes:application          kube-ingress-aws-controller

Does that look correct? Thanks

cmcconnell1 avatar Dec 06 '17 23:12 cmcconnell1

@cmcconnell1 make sure you also have the tag kubernetes.io/cluster/opsdev=owned on the node where the ingress-controller is running. This is how it discovers the cluster ID before looking up the SG.

We also support the old tag KubernetesCluster=<clusterid>, which is in place on instances of clusters deployed via kops.
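
As a rough illustration of that discovery step, the sketch below derives a cluster ID from a node's tags. It follows the description in this comment only and is not taken from the controller source; discover_cluster_id is a hypothetical name, and two things are assumptions: that a kubernetes.io/cluster/ tag counts only when its value is exactly "owned", and that this tag is tried before the legacy one.

```python
CLUSTER_TAG_PREFIX = "kubernetes.io/cluster/"
LEGACY_CLUSTER_TAG = "KubernetesCluster"


def discover_cluster_id(instance_tags):
    """Return the cluster ID derived from instance tags, or None if no tag matches."""
    # Assumption: the newer tag is checked first and must have the value "owned".
    for key, value in instance_tags.items():
        if key.startswith(CLUSTER_TAG_PREFIX) and value == "owned":
            return key[len(CLUSTER_TAG_PREFIX):]
    # Fall back to the legacy tag, whose value is the cluster ID itself.
    return instance_tags.get(LEGACY_CLUSTER_TAG)


print(discover_cluster_id({"kubernetes.io/cluster/opsdev": "owned"}))
print(discover_cluster_id({"KubernetesCluster": "opsdev"}))
```

Under these assumptions a node whose kubernetes.io/cluster/ tag carries some other value yields no cluster ID unless the legacy tag is also present.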

mikkeloscar avatar Dec 06 '17 23:12 mikkeloscar

OK, thanks @mikkeloscar great catch!

On the worker nodes from that test cluster I currently have:

kubernetes.io/cluster/opsdev   true

So the tag value is set to "true" and not "owned".
I think that value gets set by kube-aws, but I will need to track that down and do some more digging.

Thanks!

cmcconnell1 avatar Dec 06 '17 23:12 cmcconnell1

Excellent. Instead of changing the tag mentioned above (kubernetes.io/cluster/$cluster_id), using the now-legacy tag and value that @mikkeloscar noted:

KubernetesCluster   opsdev

did the trick!

alb_ingress_pod=$(kk get po | grep ingress | awk '{print $1}') && kk logs $alb_ingress_pod
2017/12/06 23:39:56 starting /bin/kube-ingress-aws-controller
2017/12/06 23:39:59 controller manifest:
2017/12/06 23:39:59   kubernetes API server:
2017/12/06 23:39:59   Cluster ID: opsdev
2017/12/06 23:39:59   vpc id: vpc-abc12345
2017/12/06 23:39:59   instance id: i-0bc22c623456a36d5
2017/12/06 23:39:59   auto scaling group name: opsdev-Nodepool1a-1RZ8Z2DRHD0RB-Workers-8HDQKVTYW7VP
2017/12/06 23:39:59   security group id: sg-xxxxx
2017/12/06 23:39:59   private subnet ids: xxxx
2017/12/06 23:39:59   public subnet ids: xxxx
2017/12/06 23:39:59 Start polling sleep 30s
2017/12/06 23:40:29 Found 0 ingresses
2017/12/06 23:40:29 Found 0 stacks
2017/12/06 23:40:29 Have 0 models
2017/12/06 23:40:29 Start polling sleep 30s

Thanks!

cmcconnell1 avatar Dec 06 '17 23:12 cmcconnell1

In case this is useful for docs or others coming here, below are the steps I used based on the (modified as needed) files from your repo and using the above noted AWS IAM policy file:

aws iam create-policy --policy-name zalando-incubator-kube-ingress-aws  --policy-document file:///zalando-incubator-kube-ingress-aws-controller/iam/workernodes-policy.json --description 'requisite policy to attach to each kube clusters kube2iam role i.e.: us-west-1-opsdevWorkerMR'

List your newly created IAM Policy and get its ARN

aws iam list-policies | grep -i zalando-incubator-kube-ingress-aws

Attach the policy to your cluster's worker role using the ARN from the step above

aws iam attach-role-policy --policy-arn arn:aws:iam::012345678901:policy/zalando-incubator-kube-ingress-aws --role-name us-west-1-opsdevWorkerMR

Validate by listing the policies attached to the cluster worker role

aws iam list-attached-role-policies --role-name us-west-1-opsdevWorkerMR

Create the requisite CF stack for the controller LB SG

aws cloudformation create-stack --stack-name zalando-incubator-kube-ingress-aws-controller-lb-sg --template-body file:///kube-ingress-aws-controller-create-sg-traffic-to-loadbalancer.yaml --tags Key=Description,Value=IngressLoadBalancerSecurityGroup

Describe stack if desired

aws cloudformation describe-stacks --stack-name zalando-incubator-kube-ingress-aws-controller-lb-sg
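
For the "get its ARN" step, an exact match on the policy name in the JSON output is less fragile than grep -i, which also matches substrings. A minimal sketch, assuming the output of aws iam list-policies has been captured as JSON text; find_policy_arn is a hypothetical helper and SAMPLE_OUTPUT is a trimmed, made-up response that reuses the placeholder account ID from the commands above.

```python
import json


def find_policy_arn(list_policies_output, policy_name):
    """Return the ARN of the policy with exactly this name, or None if absent."""
    data = json.loads(list_policies_output)
    for policy in data.get("Policies", []):
        if policy.get("PolicyName") == policy_name:
            return policy.get("Arn")
    return None


# Trimmed, made-up example of `aws iam list-policies` JSON output.
SAMPLE_OUTPUT = """
{
    "Policies": [
        {
            "PolicyName": "zalando-incubator-kube-ingress-aws",
            "Arn": "arn:aws:iam::012345678901:policy/zalando-incubator-kube-ingress-aws"
        },
        {
            "PolicyName": "zalando-incubator-kube-ingress-aws-old",
            "Arn": "arn:aws:iam::012345678901:policy/zalando-incubator-kube-ingress-aws-old"
        }
    ]
}
"""

print(find_policy_arn(SAMPLE_OUTPUT, "zalando-incubator-kube-ingress-aws"))
```

The returned ARN is the value that goes into --policy-arn when attaching the policy to the worker role.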

cmcconnell1 avatar Dec 07 '17 00:12 cmcconnell1

@cmcconnell1 would you like to create a doc PR for it? It would be awesome to get this into the docs to make it better.

szuecs avatar Dec 07 '17 15:12 szuecs

I would like to do that.

Would it be better to do a separate PR for the above requisite steps to deploy the kube-ingress-aws-controller as an addon (for other install methods such as kube-aws, and for existing clusters):

  • create the AWS IAM policies and apply to kube worker role(s)
  • create AWS CF SG
  • perform the requisite kube node tagging, etc.

so that the alb-ingress can run (and then perhaps create another PR for whatever else we'll need to do for skipper)?

Or would it be better to wait until all issues are resolved and create one PR for everything? There are still blocking issues for me, such as the current skipper error noted above (reposting what I hit yesterday):

basename `pwd` && kubectl apply -f deploy/skipper.yaml
kube-ingress-aws-controller
error: error validating "deploy/skipper.yaml": error validating data: [ValidationError(DaemonSet.spec.template.metadata): unknown field "containers" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(DaemonSet.spec.template.metadata): unknown field "hostNetwork" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta]; if you choose to ignore these errors, turn validation off with --validate=false

Regarding skipper: I was not able to follow the readme in this repo on how to deploy skipper because of the above error. To get skipper to deploy via kubectl without errors, I created skipper.yaml files from code excerpts in the "3-minutes-skipper-in-kubernetes-introduction" section of the skipper repo readme, since there were no example .yaml config files in the skipper repo and the files in kube-ingress-aws-controller didn't work for me (per the error above).

pwd
skipper
git remote -v
origin	git@github.com:zalando/skipper.git (fetch)
origin	git@github.com:zalando/skipper.git (push)

find . -type f -name \*.yaml
./.catwatch.yaml
./.zappr.yaml
./delivery.yaml
./glide.yaml

So I created these files from code excerpts from the readme in the skipper repo

ls skipper-*
skipper-demo-deployment.yaml	skipper-demo-ing.yaml		skipper-demo-svc.yaml		skipper-ingress-ds.yaml

And was able to create them via kubectl without error(s).

Thanks

cmcconnell1 avatar Dec 07 '17 19:12 cmcconnell1

@cmcconnell1 #107 fixes the skipper daemonset yaml. I stripped too much and removed the pod spec key. Regarding your suggestions for kube-aws and other cluster creation methods: if you know one of them, you could just add a section for it so that there is a nice starting point. Contribute as many docs as you like, but it does not have to be too much work for you. ;) The deploy steps make total sense to me, and thanks for your help!
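
The validation error quoted earlier in the thread reports containers and hostNetwork under DaemonSet.spec.template.metadata, which fits a missing pod spec key. A small sketch of that structural check on an already-parsed manifest follows; misplaced_pod_fields is a hypothetical helper, the manifests are trimmed stand-ins rather than the real deploy/skipper.yaml, and the image name is a placeholder.

```python
# Fields that belong under spec.template.spec, limited to the two named in the error.
POD_SPEC_ONLY_FIELDS = {"containers", "hostNetwork"}


def misplaced_pod_fields(daemonset):
    """Return pod spec fields that were found under spec.template.metadata."""
    template = daemonset.get("spec", {}).get("template", {})
    metadata = template.get("metadata", {})
    return sorted(POD_SPEC_ONLY_FIELDS & set(metadata))


# Trimmed stand-in for a manifest whose pod "spec" key was dropped.
broken = {
    "kind": "DaemonSet",
    "spec": {
        "template": {
            "metadata": {
                "labels": {"application": "skipper-ingress"},
                "hostNetwork": True,
                "containers": [{"name": "skipper-ingress", "image": "placeholder"}],
            }
        }
    },
}

# Same content with the pod-level fields moved under spec.template.spec.
fixed = {
    "kind": "DaemonSet",
    "spec": {
        "template": {
            "metadata": {"labels": {"application": "skipper-ingress"}},
            "spec": {
                "hostNetwork": True,
                "containers": [{"name": "skipper-ingress", "image": "placeholder"}],
            },
        }
    },
}

print(misplaced_pod_fields(broken))
print(misplaced_pod_fields(fixed))
```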

szuecs avatar Dec 07 '17 22:12 szuecs

@cmcconnell1 regarding skipper, we now have https://opensource.zalando.com/skipper/kubernetes/ingress-controller/, which may help. I am just reviewing GH issues in this project.

szuecs avatar Jun 09 '18 13:06 szuecs