kube-ingress-aws-controller
AWS IAM Policy and documentation update / enhancement
Hello,
Thanks for putting this out, looking forward to testing this.
In reading the kube-ingress-aws-controller prerequisites doc, specifically: "The full list of required AWS IAM roles is the following"
And the code:
{
"Action": "elasticloadbalancingv2:DescribeTargetGroups",
"Resource": "*",
"Effect": "Allow"
},
According to the doc, these seem to be intended for an IAM policy (used with a role), when in fact elasticloadbalancingv2 is not a valid action prefix there -- ref: IAM Policy Actions Grouped by Access Level | Service Actions Included in the Read Access Level. However, elasticloadbalancingv2 is valid in CloudFormation.
What would be very helpful for us (and perhaps others) would be a valid AWS IAM policy file that we could add to our kubernetes deployment code/processes/addons and simply attach to our existing kubernetes worker roles, so that our existing clusters' worker nodes could use kube-ingress-aws-controller.
This would allow something like the following on the fly (assuming we also use AWS CloudFormation to create the requisite load-balancer-elb-traffic security group too):
aws iam create-policy --policy-name zalando-incubator-kube-ingress-aws --policy-document file://alb-ingress-workernodes-policy.json
And then attach the newly created policy to our requisite kube cluster worker roles. . .
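A minimal sketch of that attach step might look like this (the role name and account ID are placeholders):
aws iam attach-role-policy --role-name <your-worker-role> --policy-arn arn:aws:iam::<account-id>:policy/zalando-incubator-kube-ingress-aws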
Regarding the proposed/requested AWS IAM policy file and what I think the referenced requirements document could use: something like the example below would be great (I just threw this together as an example -- not sure it is correct).
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "kubeIngressAwsControllerWorkerNodesPolicy",
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"iam:ListServerCertificates",
"cloudformation:Delete*",
"autoscaling:DescribeAutoScalingGroups",
"acm:ListCertificates",
"cloudformation:List*",
"ec2:DescribeRouteTables",
"autoscaling:AttachLoadBalancers",
"iam:GetServerCertificate",
"autoscaling:DetachLoadBalancers",
"elasticloadbalancing:*",
"cloudformation:Create*",
"autoscaling:DetachLoadBalancerTargetGroups",
"ec2:DescribeSecurityGroups",
"cloudformation:Describe*",
"autoscaling:AttachLoadBalancerTargetGroups",
"acm:DescribeCertificate",
"ec2:DescribeVpcs",
"ec2:DescribeSubnets",
"cloudformation:Get*"
],
"Resource": "*"
}
]
}
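Once a policy like this is attached, one way to sanity-check that it grants the actions the controller needs is the IAM policy simulator from the CLI; a minimal sketch (the role ARN and action list here are placeholders):
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::<account-id>:role/<worker-role> \
  --action-names ec2:DescribeInstances acm:ListCertificates elasticloadbalancing:DescribeTargetGroups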
Hopefully this is within scope as this would really help us to evaluate (and hopefully implement) this project.
Thank you, and thanks to @szuecs for being available and responsive in Slack, etc.
Some additional snags I hit; if these look familiar, please let me know.
Curious if someone could let me know whether we need additional tags or a specific name for the CloudFormation-created controller SG that the docs list as a prerequisite (link noted in the initial post at top).
I think I've created the requisite SG, but the pod cannot find it:
alb_ingress_pod=$(kk get po | grep ingress | awk '{print $1}') && kk logs $alb_ingress_pod
2017/12/06 21:32:52 starting /bin/kube-ingress-aws-controller
2017/12/06 21:32:52 required security group was not found
Excerpt from awless showing the security group created via CloudFormation from the prerequisites doc page:
awless ls securitygroups
| ID ▲ | VPC | INBOUND | OUTBOUND | NAME | DESCRIPTION |
|-------------|--------------|-------------------------------------|-------------------------------------|-------------------------------------|-------------------------------------|
| sg-ff123456 | vpc-abc12345 | [0.0.0.0/0](tcp:443) | [0.0.0.0/0](any) | zalando-incubator-kube-ingress-aws- | zalando-incubator-kube-ingress-aws- |
| | | [0.0.0.0/0](tcp:80) | | controller-lb-sg- | controller-lb-sg |
| | | | | IngressLoadBalancerSecurityGroup- | |
| | | | | 4A8ZY7E8R40I | |
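To double-check the tags directly on that SG, something like this should work (the group ID is the one from the awless output above):
aws ec2 describe-security-groups --group-ids sg-ff123456 --query 'SecurityGroups[0].Tags'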
I've created what I think to be the requisite IAM policy and attached that policy to the kube cluster worker role.
So, I know there are issues thus far. Additionally, in case this looks familiar: I'm hitting an issue when trying to test skipper (I will check whether there is a minimum kube version; testing on 1.7.4):
kubectl apply -f deploy/skipper.yaml
error: error validating "deploy/skipper.yaml": error validating data: [ValidationError(DaemonSet.spec.template.metadata): unknown field "containers" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(DaemonSet.spec.template.metadata): unknown field "hostNetwork" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta]; if you choose to ignore these errors, turn validation off with --validate=false
kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T19:11:22Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.4+coreos.0", GitCommit:"4bb697e04f7c356347aee6ffaa91640b428976d5", GitTreeState:"clean", BuildDate:"2017-08-22T08:43:47Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Thanks
@cmcconnell1 this sounds totally valid. I tried to extract the relevant parts from our current config; I hope this helps:
"IngressControllerIAMRole": {
"Properties": {
"AssumeRolePolicyDocument": {
"Statement": [
{
"Action": [
"sts:AssumeRole"
],
"Effect": "Allow",
"Principal": {
"Service": [
"ec2.amazonaws.com"
]
}
},
{
"Action": [
"sts:AssumeRole"
],
"Effect": "Allow",
"Principal": {
"AWS": {
"Fn::Join": [
"",
[
"arn:aws:iam::170858875137:role/",
{
"Ref": "WorkerIAMRole"
}
]
]
}
}
}
],
"Version": "2012-10-17"
},
"Path": "/",
"Policies": [
{
"PolicyDocument": {
"Statement": [
{
"Action": "acm:ListCertificates",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "acm:DescribeCertificate",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "autoscaling:DescribeAutoScalingGroups",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "autoscaling:AttachLoadBalancers",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "autoscaling:DetachLoadBalancers",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "autoscaling:DetachLoadBalancerTargetGroups",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "autoscaling:AttachLoadBalancerTargetGroups",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "cloudformation:*",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "elasticloadbalancing:*",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "elasticloadbalancingv2:*",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "ec2:DescribeInstances",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "ec2:DescribeSubnets",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "ec2:DescribeSecurityGroups",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "ec2:DescribeRouteTables",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "ec2:DescribeVpcs",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "iam:GetServerCertificate",
"Effect": "Allow",
"Resource": "*"
},
{
"Action": "iam:ListServerCertificates",
"Effect": "Allow",
"Resource": "*"
}
],
"Version": "2012-10-17"
},
"PolicyName": "root"
}
],
"RoleName": "foo-1-app-ingr-ctrl"
},
"Type": "AWS::IAM::Role"
},
"IngressLoadBalancerSecurityGroup": {
"Properties": {
"GroupDescription": {
"Ref": "AWS::StackName"
},
"SecurityGroupIngress": [
{
"CidrIp": "0.0.0.0/0",
"FromPort": 80,
"IpProtocol": "tcp",
"ToPort": 80
},
{
"CidrIp": "0.0.0.0/0",
"FromPort": 443,
"IpProtocol": "tcp",
"ToPort": 443
}
],
"Tags": [
{
"Key": "kubernetes.io/cluster/foo2",
"Value": "owned"
},
{
"Key": "kubernetes:application",
"Value": "kube-ingress-aws-controller"
}
],
"VpcId": "vpc-eecc9787"
},
"Type": "AWS::EC2::SecurityGroup"
},
You have to add the following tags to your AWS load balancer SecurityGroup before updating (e.g. via the AWS CLI, as sketched below):
- kubernetes:application=kube-ingress-aws-controller
- kubernetes.io/cluster/<cluster-id>=owned
The same is shown in the comment before.
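For example, with the AWS CLI (security group ID and cluster ID are placeholders):
aws ec2 create-tags --resources <sg-id> \
  --tags Key=kubernetes:application,Value=kube-ingress-aws-controller Key=kubernetes.io/cluster/<cluster-id>,Value=owned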
Hi @szuecs, the SG that I created does have those tags, as well as others; here's the complete list of key/values for the CF-created SG:
| Key | Value |
| --- | --- |
| Name | |
| aws:cloudformation:logical-id | IngressLoadBalancerSecurityGroup |
| aws:cloudformation:stack-id | arn:aws:cloudformation:us-west-1:123456789:stack/zalando-incubator-kube-ingress-aws-controller-lb-sg/884200e0-dacc-11e7-b8bf-50fae8e73cad |
| aws:cloudformation:stack-name | zalando-incubator-kube-ingress-aws-controller-lb-sg |
| kubernetes.io/cluster/opsdev | owned |
| kubernetes:application | kube-ingress-aws-controller |
Does that look correct? Thanks
@cmcconnell1 make sure you also have the tag kubernetes.io/cluster/opsdev=owned on the node where the ingress controller is running. This is how it discovers the cluster ID before looking up the SG. We also support the old tag KubernetesCluster=<clusterid>, which is used for instances in clusters deployed via kops.
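As an illustration, tagging an existing worker node and its auto scaling group (so future nodes inherit the tag) could look roughly like this; the instance ID, ASG name, and cluster ID are placeholders:
aws ec2 create-tags --resources <instance-id> --tags Key=kubernetes.io/cluster/<cluster-id>,Value=owned
aws autoscaling create-or-update-tags \
  --tags "ResourceId=<asg-name>,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/<cluster-id>,Value=owned,PropagateAtLaunch=true"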
OK, thanks @mikkeloscar, great catch!
On the worker nodes from that test cluster I currently have:
kubernetes.io/cluster/opsdev true
So the tag value is set to "true" and not "owned".
I think that value gets set by kube-aws, but I will need to track that down. Will do some more digging on this.
Thanks!
Excellent -- so instead of changing the above-mentioned tag (kubernetes.io/cluster/$cluster_id), using the now-legacy tag/value that @mikkeloscar noted above:
KubernetesCluster opsdev
did the trick!
alb_ingress_pod=$(kk get po | grep ingress | awk '{print $1}') && kk logs $alb_ingress_pod
2017/12/06 23:39:56 starting /bin/kube-ingress-aws-controller
2017/12/06 23:39:59 controller manifest:
2017/12/06 23:39:59 kubernetes API server:
2017/12/06 23:39:59 Cluster ID: opsdev
2017/12/06 23:39:59 vpc id: vpc-abc12345
2017/12/06 23:39:59 instance id: i-0bc22c623456a36d5
2017/12/06 23:39:59 auto scaling group name: opsdev-Nodepool1a-1RZ8Z2DRHD0RB-Workers-8HDQKVTYW7VP
2017/12/06 23:39:59 security group id: sg-xxxxx
2017/12/06 23:39:59 private subnet ids: xxxx
2017/12/06 23:39:59 public subnet ids: xxxx
2017/12/06 23:39:59 Start polling sleep 30s
2017/12/06 23:40:29 Found 0 ingresses
2017/12/06 23:40:29 Found 0 stacks
2017/12/06 23:40:29 Have 0 models
2017/12/06 23:40:29 Start polling sleep 30s
Thanks!
In case this is useful for the docs or for others coming here, below are the steps I used, based on the (modified as needed) files from your repo and the above-noted AWS IAM policy file:
aws iam create-policy --policy-name zalando-incubator-kube-ingress-aws --policy-document file:///zalando-incubator-kube-ingress-aws-controller/iam/workernodes-policy.json --description 'requisite policy to attach to each kube clusters kube2iam role i.e.: us-west-1-opsdevWorkerMR'
List your newly created IAM Policy and get its ARN
aws iam list-policies | grep -i zalando-incubator-kube-ingress-aws
Attach the policy to your cluster worker role using the ARN generated in the step above
aws iam attach-role-policy --policy-arn arn:aws:iam::012345678901:policy/zalando-incubator-kube-ingress-aws --role-name us-west-1-opsdevWorkerMR
Validate by listing the policies attached to our cluster worker role
aws iam list-attached-role-policies --role-name us-west-1-opsdevWorkerMR
Create the requisite CF stack for the controller LB SG
aws cloudformation create-stack --stack-name zalando-incubator-kube-ingress-aws-controller-lb-sg --template-body file:///kube-ingress-aws-controller-create-sg-traffic-to-loadbalancer.yaml --tags Key=Description,Value=IngressLoadBalancerSecurityGroup
Describe stack if desired
aws cloudformation describe-stacks --stack-name zalando-incubator-kube-ingress-aws-controller-lb-sg
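If you then need the ID of the security group the stack created (e.g. to verify its tags), one way to pull it out is (the --query expression is just one option):
aws cloudformation describe-stack-resources --stack-name zalando-incubator-kube-ingress-aws-controller-lb-sg \
  --query "StackResources[?ResourceType=='AWS::EC2::SecurityGroup'].PhysicalResourceId" --output text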
@cmcconnell1 would you like to create a doc PR for it? It would be awesome to get this into the docs to make it better.
> @cmcconnell1 would you like to create a doc PR for it? It would be awesome to get this into the docs to make it better.
I would like to do that.
Would it be better to do a separate PR for the above requisite steps to deploy the kube-ingress-aws-controller as an addon (for other install methods (kube-aws, etc.) and for existing clusters):
- create the AWS IAM policies and apply to kube worker role(s)
- create AWS CF SG
- perform the requisite kube node tagging, etc.
so that the alb-ingress can run (and then perhaps create another PR for whatever else we'll need to do for skipper)?
Or would it be better to wait until all issues are resolved and create one PR for everything, as there are still blocking issues for me -- the current skipper error as noted above (reposting what I hit yesterday):
basename `pwd` && kubectl apply -f deploy/skipper.yaml
kube-ingress-aws-controller
error: error validating "deploy/skipper.yaml": error validating data: [ValidationError(DaemonSet.spec.template.metadata): unknown field "containers" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(DaemonSet.spec.template.metadata): unknown field "hostNetwork" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta]; if you choose to ignore these errors, turn validation off with --validate=false
Regarding skipper: I was not able to follow the readme doc from this repo on how to deploy skipper, due to the above error. Note that to get skipper to actually deploy via kubectl without errors, I created skipper .yaml files from code excerpts I pulled out of the skipper repo readme (the 3-minutes-skipper-in-kubernetes-introduction section), as there were no example .yaml config files in the skipper repo, and the files in the kube-ingress-aws-controller repo didn't work for me per the above error.
pwd
skipper
git remote -v
origin [email protected]:zalando/skipper.git (fetch)
origin [email protected]:zalando/skipper.git (push)
find . -type f -name \*.yaml
./.catwatch.yaml
./.zappr.yaml
./delivery.yaml
./glide.yaml
So I created these files from code excerpts from the readme in the skipper repo
ls skipper-*
skipper-demo-deployment.yaml skipper-demo-ing.yaml skipper-demo-svc.yaml skipper-ingress-ds.yaml
And was able to create them via kubectl without error(s).
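For reference, applying them was just something like:
for f in skipper-demo-deployment.yaml skipper-demo-svc.yaml skipper-demo-ing.yaml skipper-ingress-ds.yaml; do kubectl apply -f "$f"; done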
Thanks
@cmcconnell1 #107 fixes the skipper daemonset yaml; I stripped out too much and removed the pod spec key. Regarding your suggestions about kube-aws and other cluster creation methods: if you know one of them, you could just add a section for that one, so that there is a nice starting point. Contribute as much documentation as you like, but it does not have to be too much work for you. ;) The deploy steps make total sense to me, and thanks for your help!
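In other words, hostNetwork and containers need to sit under spec.template.spec, not under spec.template.metadata. A rough sketch of the intended nesting (not the actual skipper manifest; the image and names here are placeholders):
kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: skipper-ingress
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        application: skipper-ingress
    spec:                 # the pod spec key that was missing
      hostNetwork: true   # belongs here ...
      containers:         # ... and so does this
      - name: skipper-ingress
        image: <skipper-image>
EOF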
@cmcconnell1 regarding skipper, we now have https://opensource.zalando.com/skipper/kubernetes/ingress-controller/ -- maybe this helps. I am just reviewing GH issues in this project.