aws-load-balancer-controller
Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet
When trying to deploy the sample application here, the controller fails to create the load balancer with the following error:
{"level":"debug","ts":1663820503.429755,"logger":"controller-runtime.manager.events","msg":"Warning","object":{"kind":"Ingress","namespace":"ga ││ me-2048","name":"ingress-2048","uid":"f748a08a-08ba-4409-a2ca-71eff8262a75","apiVersion":"networking.k8s.io/v1","resourceVersion":"2315298"},"r ││ eason":"FailedBuildModel","message":"Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet"}
I followed this guide to deploy the AWS Load Balancer Controller, using the IAM policy linked in that doc. All my public subnets have the tags:
kubernetes.io/role/elb | 1
kubernetes.io/cluster/<clusterName> | owned
And all of my private subnets have the tags:
kubernetes.io/role/internal-elb | 1
kubernetes.io/cluster/<clusterName> | owned
What else can I check for?
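For reference, tags like the ones above can be applied from the AWS CLI as well; a minimal sketch, where the subnet IDs and cluster name are placeholders:

aws ec2 create-tags --resources subnet-aaaa1111 subnet-bbbb2222 \
  --tags Key=kubernetes.io/role/elb,Value=1 Key=kubernetes.io/cluster/<clusterName>,Value=owned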
Environment
- AWS Load Balancer controller version - 2.4.3
- Using EKS - Yes, 1.22
After I added this annotation explicitly listing out the subnets, it worked:
alb.ingress.kubernetes.io/subnets:
but it's not ideal.
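For example (a sketch only; the subnet IDs are placeholders, and the ingress name and namespace are taken from the error above), the annotation can be set with kubectl:

kubectl annotate ingress ingress-2048 -n game-2048 \
  alb.ingress.kubernetes.io/subnets=subnet-aaaa1111,subnet-bbbb2222,subnet-cccc3333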
Check if you are able to query the subnets via the aws-cli:
aws ec2 describe-subnets --filters Name=tag:kubernetes.io/role/elb,Values=1 Name=tag:kubernetes.io/cluster/<clusterName>,Values=shared,owned
Also verify that the cluster name is configured correctly in the controller's command-line flags.
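One way to check that, assuming the default deployment name and namespace used by the Helm chart:

kubectl -n kube-system get deployment aws-load-balancer-controller \
  -o jsonpath='{.spec.template.spec.containers[0].args}'
# look for --cluster-name=<clusterName> in the printed args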
Thanks @kishorj, yes I did run that and it correctly returns the 3 public subnets in different AZs. Also the clusterName is correctly configured in the controller command args.
@hiteshghia, please check that the subnet VPC and the controller VPC are the same. You could add a vpc-id filter to the AWS CLI: Name=vpc-id,Values=vpc-xxxxx
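A rough sketch of that check (the cluster name and VPC ID are placeholders), looking up the cluster's VPC and rerunning the query scoped to it:

aws eks describe-cluster --name <clusterName> \
  --query "cluster.resourcesVpcConfig.vpcId" --output text
aws ec2 describe-subnets \
  --filters Name=tag:kubernetes.io/role/elb,Values=1 Name=vpc-id,Values=<vpc-id>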
Could you also send the CloudTrail trace for the DescribeSubnets calls in both cases:
- When you run the aws describe-subnets CLI
- The call made by the LB controller
You can email the details to k8s-alb-controller-triage AT amazon.com.
I ran into something similar today (trying to create an external NLB). Specifically, it looks like the controller may not actually default to external load balancers (i.e. default service.beta.kubernetes.io/aws-load-balancer-internal to false) as indicated in the documentation.
This did not work:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
  name: nginx-external
  namespace: confluent-proxy
spec:
  loadBalancerClass: service.k8s.aws/nlb
  ports:
  - name: kafka
    port: 9092
    protocol: TCP
    targetPort: 9092
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    app.kubernetes.io/name: nginx
  sessionAffinity: None
  type: LoadBalancer
However, if I add this annotation, it starts working:
service.beta.kubernetes.io/aws-load-balancer-internal: "false"
(It also works if I explicitly set service.beta.kubernetes.io/aws-load-balancer-subnets.)
My subnets are all properly tagged, I believe (cluster name is justinlee-aws-privatelink):
kubernetes.io/cluster/justinlee-aws-privatelink | owned
kubernetes.io/role/elb | 1
But looking at CloudTrail, it looks like the controller was looking for the kubernetes.io/role/internal-elb tag rather than the regular external kubernetes.io/role/elb tag:
"requestParameters": {
"subnetSet": {},
"filterSet": {
"items": [
{
"name": "tag:kubernetes.io/role/internal-elb",
"valueSet": {
"items": [
{},
{
"value": "1"
}
]
}
},
{
"name": "vpc-id",
"valueSet": {
"items": [
{
"value": "vpc-04baa86d522c4fea0"
}
]
}
}
]
}
},
@justinrlee, the controller provisions an internal NLB by default unless you specify the scheme via the annotation service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing.
Since you've specified the loadBalancerClass, the annotation service.beta.kubernetes.io/aws-load-balancer-type: external has no effect.
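As a sketch against the Service from the manifest above (name and namespace taken from it), the scheme annotation can be added with kubectl:

kubectl annotate service nginx-external -n confluent-proxy \
  service.beta.kubernetes.io/aws-load-balancer-scheme=internet-facing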
Ah, thank you! Somehow I missed that documentation note.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
You basically need to tag the subnets correctly, and you have to do it for at least 2 subnets. The error message you'll see when you describe the service incorrectly says it failed to discover at least 1 subnet; it actually has to discover at least 2. https://aws.amazon.com/premiumsupport/knowledge-center/eks-load-balancer-controller-subnets/
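As a quick sketch, you can confirm that at least two tagged subnets are discoverable in the cluster's VPC (the VPC ID is a placeholder; use kubernetes.io/role/internal-elb instead for internal load balancers):

aws ec2 describe-subnets \
  --filters Name=tag:kubernetes.io/role/elb,Values=1 Name=vpc-id,Values=<vpc-id> \
  --query "Subnets[].SubnetId" --output text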
If you manage your VPC with Terraform, TF is great about updating tags instantly.
public_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/cluster/temp"                  = "shared" # temp extra cluster
  "kubernetes.io/role/elb"                      = "1"
}
private_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/cluster/temp"                  = "shared" # temp extra cluster
  "kubernetes.io/role/internal-elb"             = "1"
}
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.