aws-load-balancer-controller
Listener and listener-rule AWS tagging records leak when deleting an ALB
We have a simple non-shared-ALB-per-Ingress setup, and recently noticed while cleaning up an old cluster (running ALBC v2.3.0) that a number of minor AWS resources had leaked. For example, if you run the command below (with a suitable replacement for myclustername), you can see all the load-balancer resources this controller created for a cluster:
aws resourcegroupstaggingapi get-resources --tag-filters "Key=elbv2.k8s.aws/cluster,Values=myclustername" --query ResourceTagMappingList[].ResourceARN
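For illustration only (the region, account ID, names and resource IDs below are placeholders, not values from this cluster), the returned ARNs have roughly these shapes:

arn:aws:elasticloadbalancing:eu-west-1:111122223333:loadbalancer/app/k8s-example/0123456789abcdef
arn:aws:elasticloadbalancing:eu-west-1:111122223333:listener/app/k8s-example/0123456789abcdef/fedcba9876543210
arn:aws:elasticloadbalancing:eu-west-1:111122223333:listener-rule/app/k8s-example/0123456789abcdef/fedcba9876543210/0123456789abcdef
arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/k8s-example/0123456789abcdef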
If you then delete an Ingress that originally created an ALB (i.e. not just a TargetGroup via the CRD), the "loadbalancer" and "targetgroup" resources are deleted, but the "listener" and "listener-rule" ones are not.
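As a rough sketch (not part of the original report), you can narrow the tagging output to just the leaked listener and listener-rule records by filtering on the ARN; the substring ':listener' matches both resource types:

aws resourcegroupstaggingapi get-resources \
  --tag-filters "Key=elbv2.k8s.aws/cluster,Values=myclustername" \
  --query "ResourceTagMappingList[?contains(ResourceARN, ':listener')].ResourceARN"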
The controller pod logs show creation entries for the listener/listener-rule resources but no matching deletion entries, unlike for the loadbalancer/targetgroup resources. No other errors are visible apart from a "Reconciler error" saying it cannot find the load balancer (which it deleted a few milliseconds earlier, so that is probably expected!)
Environment:
- AWS Load Balancer controller version: 2.3.0, but I also saw this with a test cluster on 2.4.1
- Kubernetes version: 1.20
- Using EKS: yes, latest 1.20 platform version
Checked our environment after reading this and can confirm the same behavior
Environment:
- AWS Load Balancer controller version: 2.4.1
- Kubernetes version: 1.20
- Using EKS: yes, latest 1.20 platform version eks.3
@fargue @tyrken We only delete the LoadBalancer when deleting the Ingresses. The Listeners/ListenerRules should automatically be deleted when the LoadBalancer is deleted.
I suspect this is an eventual-consistency issue with resourcegroupstaggingapi. Can you confirm whether these records eventually go away or stay stuck in your account?
If they are stuck, could you help run aws elbv2 describe-listeners for the ARNs returned?
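For clarity, the check would look something like this (substitute one of the leaked listener ARNs for the placeholder):

aws elbv2 describe-listeners --listener-arns <leaked-listener-arn>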
According to AWS support, the listener/listener-rule resources themselves are deleted when the load balancer is, but the resourcegroupstaggingapi records get left behind. They never seem to get cleaned up - at least we have some that are, at a guess, months old.
aws elbv2 describe-listeners fails for the leaked listener ARNs. Interestingly, you can run aws elbv2 delete-rule against one of the listener-rule ARNs repeatedly without error; doing so does clean up the resourcegroupstaggingapi record for that rule, but nothing similar works for the listener records.
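For reference, a sketch of the commands described above, with placeholders for the leaked ARNs:

# Fails, since the listener itself no longer exists
aws elbv2 describe-listeners --listener-arns <leaked-listener-arn>
# Succeeds repeatedly, and clears the tagging record for that rule
aws elbv2 delete-rule --rule-arn <leaked-listener-rule-arn>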
This appears to be an issue with the resource tagging APIs. We are following up with the ELB team for further investigation.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.