aws-load-balancer-controller
ALBC unexpectedly builds an empty model
Describe the bug
Despite the presence of an Ingress resource, ALBC builds an empty model:
{"level":"info","ts":1660227644.8071668,"logger":"controllers.ingress","msg":"successfully built model","model":"{\"id\":\"nonprod-5-eks-nginx\",\"resources\":{}}"}
and as a result attempts to delete the target group.
Steps to reproduce
Prepare an EKS cluster, create an IAM role for ALBC (without a ServiceAccount in k8s) and a target group. Then deploy https://github.com/igor-mendix/ingress like this (beware of the hardcoded ingress namespace):
helm upgrade -i ingress ./ -n ingress --create-namespace \
--set awslbc.clusterName=<your cluster name> \
--set awslbc.serviceAccount.create=true \
--set awslbc.serviceAccount.name=aws-load-balancer-controller \
--set awslbc.serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn=<IAM role ARN> \
--set targetGroupARN=<target group ARN>
Expected outcome
ALBC picks up the Ingress, generates a non-empty model, and registers targets in the target group.
Environment
- Chart version: 1.4.2
- EKS 1.22 (cluster spawned with Terraform)
Additional Context:
The relevant Ingress resource is present, is in the correct namespace, and has the correct ingressClassName. I was thinking there might be some missing permission on the related service account or some other access issue, so I spawned an additional ubuntu container in the same pod (as described here) and was able to get the Ingress resources just fine (the sidecar uses the same service account, so access should be fine for the ALBC itself too).
Sorry if I missed something obvious, but I'm just really stuck.
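For reference, a quicker way to run the same access check without a sidecar, assuming the controller's service account is named aws-load-balancer-controller in the ingress namespace as configured above, would be something like:
kubectl auth can-i list ingresses.networking.k8s.io -n ingress \
  --as=system:serviceaccount:ingress:aws-load-balancer-controller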
ALBC pod logs
{"level":"info","ts":1660295729.0258105,"msg":"version","GitVersion":"v2.4.2","GitCommit":"77370be7f8e13787a3ec0cfa99de1647010f1055","BuildDate":"2022-05-24T22:33:27+0000"}
{"level":"info","ts":1660295729.0475886,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1660295729.0504375,"logger":"setup","msg":"adding health check for controller"}
{"level":"info","ts":1660295729.0505106,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":1660295729.0506046,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1660295729.0506785,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1660295729.050755,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
{"level":"info","ts":1660295729.050821,"logger":"setup","msg":"starting podInfo repo"}
I0812 09:15:31.051413 15 leaderelection.go:243] attempting to acquire leader lease ingress/aws-load-balancer-controller-leader...
{"level":"info","ts":1660295731.051512,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1660295731.0516114,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"info","ts":1660295731.051983,"logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":1660295731.0520697,"logger":"controller-runtime.webhook","msg":"serving webhook server","host":"","port":9443}
{"level":"info","ts":1660295731.0524106,"logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
I0812 09:15:46.337187 15 leaderelection.go:253] successfully acquired lease ingress/aws-load-balancer-controller-leader
{"level":"info","ts":1660295746.3374712,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting EventSource","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3375266,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting EventSource","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.337534,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting EventSource","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.33754,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting EventSource","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3375454,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting Controller","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding"}
{"level":"debug","ts":1660295746.3373365,"logger":"controller-runtime.manager.events","msg":"Normal","object":{"kind":"ConfigMap","namespace":"ingress","name":"aws-load-balancer-controller-leader","uid":"6f65362a-b081-4a2c-b73d-86f137ada0a4","apiVersion":"v1","resourceVersion":"869485"},"reason":"LeaderElection","message":"ingress-awslbc-76c58f6d88-lvsvh_964f8718-8e1f-416f-ae76-6b9d58beee79 became leader"}
{"level":"info","ts":1660295746.3377564,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"channel source: 0xc0002e8b90"}
{"level":"info","ts":1660295746.3378165,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"channel source: 0xc0002e8be0"}
{"level":"info","ts":1660295746.337826,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3378372,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3378444,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"channel source: 0xc0002e8c30"}
{"level":"info","ts":1660295746.337853,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"channel source: 0xc0002e8f00"}
{"level":"info","ts":1660295746.3378615,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3378675,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.3378747,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting Controller"}
{"level":"info","ts":1660295746.3379805,"logger":"controller-runtime.manager.controller.service","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1660295746.337999,"logger":"controller-runtime.manager.controller.service","msg":"Starting Controller"}
{"level":"debug","ts":1660295746.3383372,"logger":"controllers.ingress.eventHandlers.ingressClass","msg":"enqueue ingress for ingressClass event","ingressClass":"alb","ingress":{"namespace":"ingress","name":"ingress"}}
{"level":"debug","ts":1660295746.338426,"logger":"controllers.ingress.eventHandlers.ingress","msg":"enqueue ingressGroup for ingress event","ingress":"ingress/ingress","ingressGroup":"nonprod-5-eks-nginx"}
{"level":"debug","ts":1660295746.3385406,"logger":"controllers.ingress.eventHandlers.service","msg":"enqueue ingress for service event","service":{"namespace":"ingress","name":"nginx-controller"},"ingress":{"namespace":"ingress","name":"ingress"}}
{"level":"debug","ts":1660295746.338592,"logger":"controllers.ingress.eventHandlers.ingress","msg":"enqueue ingressGroup for ingress event","ingress":"ingress/ingress","ingressGroup":"nonprod-5-eks-nginx"}
{"level":"debug","ts":1660295746.338613,"logger":"controllers.ingress.eventHandlers.ingress","msg":"enqueue ingressGroup for ingress event","ingress":"ingress/ingress","ingressGroup":"nonprod-5-eks-nginx"}
{"level":"info","ts":1660295746.4379313,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Starting workers","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","worker count":3}
{"level":"info","ts":1660295746.4380546,"logger":"controller-runtime.manager.controller.service","msg":"Starting workers","worker count":3}
{"level":"info","ts":1660295746.4384162,"logger":"controller-runtime.manager.controller.ingress","msg":"Starting workers","worker count":3}
{"level":"info","ts":1660295746.4385622,"logger":"controllers.ingress","msg":"successfully built model","model":"{\"id\":\"nonprod-5-eks-nginx\",\"resources\":{}}"}
{"level":"debug","ts":1660295746.4385846,"logger":"controllers.ingress.eventHandlers.ingressClassParams","msg":"enqueue ingressClass for ingressClassParams event","ingressClassParams":"alb","ingressClass":"alb"}
{"level":"debug","ts":1660295746.4386365,"logger":"controllers.ingress.eventHandlers.ingressClass","msg":"enqueue ingress for ingressClass event","ingressClass":"alb","ingress":{"namespace":"ingress","name":"ingress"}}
{"level":"debug","ts":1660295746.4386818,"logger":"controllers.ingress.eventHandlers.ingress","msg":"enqueue ingressGroup for ingress event","ingress":"ingress/ingress","ingressGroup":"nonprod-5-eks-nginx"}
{"level":"info","ts":1660295747.3786626,"logger":"controllers.ingress","msg":"deleting targetGroup","arn":"arn:aws:elasticloadbalancing:eu-west-1:<cut>:targetgroup/nonprod-5-eks-nginx-controller/8c049997d02979f1"}
{"level":"error","ts":1660295767.5689254,"logger":"controller-runtime.manager.controller.ingress","msg":"Reconciler error","name":"nonprod-5-eks-nginx","namespace":"","error":"failed to delete targetGroup: timed out waiting for the condition"}
{"level":"info","ts":1660295767.5690646,"logger":"controllers.ingress","msg":"successfully built model","model":"{\"id\":\"nonprod-5-eks-nginx\",\"resources\":{}}"}
{"level":"info","ts":1660295767.9722342,"logger":"controllers.ingress","msg":"deleting targetGroup","arn":"arn:aws:elasticloadbalancing:eu-west-1:<cut>:targetgroup/nonprod-5-eks-nginx-controller/8c049997d02979f1"}
It also deletes my targetGroupBinding if I try to re-add it
{"level":"info","ts":1660296741.5694022,"logger":"controllers.ingress","msg":"successfully built model","model":"{\"id\":\"nonprod-5-eks-nginx\",\"resources\":{}}"}
{"level":"info","ts":1660296742.049092,"logger":"controllers.ingress","msg":"deleting targetGroupBinding","targetGroupBinding":{"namespace":"ingress","name":"nonprod-5-eks-nginx-controller"}}
{"level":"info","ts":1660296742.0934792,"msg":"deRegistering targets","arn":"arn:aws:elasticloadbalancing:eu-west-1:<cut>:targetgroup/nonprod-5-eks-nginx-controller/8c049997d02979f1","targets":[{"AvailabilityZone":"eu-west-1b","Id":"10.11.154.229","Port":80}]}
{"level":"info","ts":1660296742.133965,"msg":"deRegistered targets","arn":"arn:aws:elasticloadbalancing:eu-west-1:<cut>:targetgroup/nonprod-5-eks-nginx-controller/8c049997d02979f1"}
{"level":"info","ts":1660296742.2581291,"logger":"controllers.ingress","msg":"deleted targetGroupBinding","targetGroupBinding":{"namespace":"ingress","name":"nonprod-5-eks-nginx-controller"}}
`kubectl get ing -o yaml` from within the pod (the ubuntu sidecar mentioned above)
root@ingress-awslbc-65bcf99565-l76c7:/# kubectl get ing -o yaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    annotations:
      alb.ingress.kubernetes.io/group.name: nonprod-5-eks-nginx
      alb.ingress.kubernetes.io/healthcheck-path: /healthz
      alb.ingress.kubernetes.io/load-balancer-name: nonprod-5-eks-alb
      alb.ingress.kubernetes.io/scheme: internal
      alb.ingress.kubernetes.io/target-type: ip
      meta.helm.sh/release-name: ingress
      meta.helm.sh/release-namespace: ingress
    creationTimestamp: "2022-08-11T09:16:58Z"
    deletionGracePeriodSeconds: 0
    deletionTimestamp: "2022-08-11T09:18:15Z"
    finalizers:
    - group.ingress.k8s.aws/nonprod-5-eks-nginx
    generation: 2
    labels:
      app.kubernetes.io/instance: ingress
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: ingress
      app.kubernetes.io/version: 0.3.0
      helm.sh/chart: ingress-0.3.0
    name: ingress
    namespace: ingress
    resourceVersion: "573700"
    uid: 49d44c72-5dc5-4b1c-bd2d-e5b703464859
  spec:
    ingressClassName: alb
    rules:
    - http:
        paths:
        - backend:
            service:
              name: nginx-controller
              port:
                number: 80
          path: /
          pathType: Prefix
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
`ps faux` from within the pod (ubuntu sidecar)
root@ingress-awslbc-65bcf99565-l76c7:/# ps faux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
65532 15 0.2 0.9 1381208 72504 ? Ssl 09:30 0:02 /controller --cluster-name=nonprod-5-eks --ingress-class=alb --watch-namespace=ingress
root 7 0.0 0.0 4620 3760 pts/0 Ss 09:30 0:00 bash
root 2699 0.0 0.0 7056 1588 pts/0 R+ 09:48 0:00 \_ ps faux
65535 1 0.0 0.0 968 4 ? Ss 09:30 0:00 /pause
@igor-mendix, an empty model implies the ingress got deleted, or that the ingress group, if configured, has no active members because its ingresses got deleted. Your ingress spec contains a nonzero deletion timestamp:
deletionTimestamp: "2022-08-11T09:18:15Z"
Could you make sure you did not delete the ingress? The ingress resource stays around until the controller is able to clean up all of the underlying AWS resources and remove the finalizer. It is possible the cleanup failed or is still in progress.
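For anyone hitting the same symptom, a quick way to confirm this state is to look at the deletionTimestamp and finalizers directly; a minimal sketch, using the resource names from this issue:
kubectl get ingress ingress -n ingress \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
A non-empty deletionTimestamp means the Ingress is already terminating and is only held back by the listed finalizers.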
Oooh, thanks a lot! This is indeed a remnant of a previous Helm uninstall. I've just checked, and the Ingress does not get deleted when I uninstall the chart, even though ALBC itself is removed (and pretty much everything with it), so it gets stuck forever. The finalizer on it is this:
group.ingress.k8s.aws/nonprod-5-eks-nginx
but a lot of things in this cluster are set up separately by Terraform, so I guess whatever it depends on in AWS would not get deleted. As I understand it, this finalizer is added by ALBC. Is there maybe a way to prevent it from doing this?
@igor-mendix, the ALBC adds the finalizer to ensure all underlying AWS resources get cleaned up successfully on ingress deletion. You need to ensure the controller chart does not get uninstalled before the ingresses get deleted.
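If the controller has already been uninstalled and an Ingress is left stuck terminating, one generic Kubernetes workaround (not an ALBC feature) is to remove the finalizer manually, accepting that any AWS resources the controller created will not be cleaned up; a minimal sketch with the names from this issue:
kubectl patch ingress ingress -n ingress --type=merge \
  -p '{"metadata":{"finalizers":null}}'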
It's all part of a single chart (ALBC is a dependency) and I don't know if there's a way to control the order in which stuff is deleted during uninstall. Will have to dig a bit deeper.
My experiments led me to this: if I delete the last Ingress resource, ALBC tries to remove the target group it points at. But this TG is created outside k8s (and is attached to an ALB), so it cannot be deleted, which results in the resource hanging in this "deleted" state indefinitely. Is there a way to tell ALBC not to touch the TG?
@igor-mendix, unless you manually create a TargetGroupBinding to bind to an out-of-band TG, the controller assumes ownership of the ALB/target groups and deletes them when there are no more ingresses in the group.
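For context, a minimal sketch of such a manually created TargetGroupBinding for this setup (the name is illustrative, the ARN is a placeholder, and targetType: ip assumes the TG was created with IP targets as elsewhere in this issue):
kubectl apply -n ingress -f - <<'EOF'
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: nginx-controller
spec:
  serviceRef:
    name: nginx-controller   # existing Service whose endpoints become targets
    port: 80
  targetGroupARN: <target group ARN>   # the TG created outside k8s (e.g. by Terraform)
  targetType: ip
EOF
Because the TG is referenced rather than created by the controller, ALBC should only manage target registration and leave the TG itself alone.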
With the help of someone on Slack I figured out that if I only want ALBC to control targets in the TG, I don't need an Ingress resource at all. An Ingress tells ALBC how to configure the ALB and its attached TGs. Unfortunately, this difference between Ingress and TargetGroupBinding is not clear from the documentation, nor is it obvious from kubectl output or the entities in AWS.
It would be nice if a clear description of the roles of Ingress and TargetGroupBinding, and how their presence translates into AWS changes/resources, were added to the documentation.