aws-load-balancer-controller

Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet

Open hiteshghia opened this issue 2 years ago • 4 comments

When trying to deploy the sample application here the controller is failing to create the load balancer with the following error:

{"level":"debug","ts":1663820503.429755,"logger":"controller-runtime.manager.events","msg":"Warning","object":{"kind":"Ingress","namespace":"game-2048","name":"ingress-2048","uid":"f748a08a-08ba-4409-a2ca-71eff8262a75","apiVersion":"networking.k8s.io/v1","resourceVersion":"2315298"},"reason":"FailedBuildModel","message":"Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet"}

I followed this guide to deploy the AWS Load Balancer Controller, using the IAM policy linked in that doc. All my public subnets have the tags:

kubernetes.io/role/elb | 1
kubernetes.io/cluster/<clusterName> | owned

And all of my private subnets have the tags:

kubernetes.io/role/internal-elb | 1
kubernetes.io/cluster/<clusterName> | owned

What else can I check for?

Environment

  • AWS Load Balancer controller version - 2.4.3
  • Using EKS - Yes, 1.22

hiteshghia avatar Sep 22 '22 04:09 hiteshghia

After I added the alb.ingress.kubernetes.io/subnets annotation, explicitly listing the subnets, the load balancer was created, but it's not ideal.

hiteshghia avatar Sep 22 '22 05:09 hiteshghia

Check if you are able to query the subnets via the aws-cli:

aws ec2 describe-subnets --filters Name=tag:kubernetes.io/role/elb,Values=1 Name=tag:kubernetes.io/cluster/<clusterName>,Values=shared,owned

Also verify if the cluster name is configured correctly in the controller command line flags.
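
The two tag filters in the command above can also be applied offline to saved describe-subnets output. This is a minimal sketch, assuming the JSON shape returned by `aws ec2 describe-subnets --output json`; the function names, the cluster name "demo", and the subnet IDs are hypothetical. It mirrors the CLI filters only, not necessarily the controller's exact discovery logic.

```python
ROLE_PUBLIC = "kubernetes.io/role/elb"
ROLE_INTERNAL = "kubernetes.io/role/internal-elb"


def tags_of(subnet):
    """Flatten the Tags list of one subnet record into a plain dict."""
    return {t["Key"]: t["Value"] for t in subnet.get("Tags", [])}


def matches_filters(subnet, cluster_name, role_tag=ROLE_PUBLIC):
    """True if the role tag is "1" and the cluster tag is shared or owned."""
    tags = tags_of(subnet)
    cluster_tag = "kubernetes.io/cluster/" + cluster_name
    return tags.get(role_tag) == "1" and tags.get(cluster_tag) in ("shared", "owned")


def matching_subnet_ids(describe_output, cluster_name, role_tag=ROLE_PUBLIC):
    """Return the SubnetId of every subnet that passes both tag checks."""
    return [
        s["SubnetId"]
        for s in describe_output.get("Subnets", [])
        if matches_filters(s, cluster_name, role_tag)
    ]


# Hypothetical sample data, shaped like describe-subnets JSON output.
SAMPLE = {
    "Subnets": [
        {"SubnetId": "subnet-aaa", "Tags": [
            {"Key": ROLE_PUBLIC, "Value": "1"},
            {"Key": "kubernetes.io/cluster/demo", "Value": "owned"},
        ]},
        {"SubnetId": "subnet-bbb", "Tags": [
            {"Key": ROLE_INTERNAL, "Value": "1"},
            {"Key": "kubernetes.io/cluster/demo", "Value": "owned"},
        ]},
        # Role tag present, but the cluster tag is missing.
        {"SubnetId": "subnet-ccc", "Tags": [
            {"Key": ROLE_PUBLIC, "Value": "1"},
        ]},
    ]
}

public_ids = matching_subnet_ids(SAMPLE, "demo")
internal_ids = matching_subnet_ids(SAMPLE, "demo", ROLE_INTERNAL)
```

An empty result for the role tag you expect points at a tagging or cluster-name mismatch rather than at the controller itself.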

kishorj avatar Sep 22 '22 16:09 kishorj

Thanks @kishorj, yes, I did run that and it correctly returns the 3 public subnets in different AZs. The clusterName is also correctly configured in the controller command args.

hiteshghia avatar Sep 22 '22 16:09 hiteshghia

@hiteshghia, please check that the subnet VPC and the controller VPC are the same. You could add the vpc-id filter to the aws cli command: Name=vpc-id,Values=vpc-xxxxx
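
The same VPC comparison can be done on saved describe-subnets output. A minimal sketch, assuming the standard `Subnets[].VpcId` field of that output; the function names, VPC IDs, and subnet IDs below are hypothetical.

```python
def subnets_by_vpc(describe_output):
    """Group subnet IDs by the VPC they belong to."""
    grouped = {}
    for s in describe_output.get("Subnets", []):
        grouped.setdefault(s["VpcId"], []).append(s["SubnetId"])
    return grouped


def outside_vpc(describe_output, controller_vpc_id):
    """List tagged subnets that are NOT in the VPC the controller is using."""
    return [
        s["SubnetId"]
        for s in describe_output.get("Subnets", [])
        if s["VpcId"] != controller_vpc_id
    ]


# Hypothetical sample: two subnets in one VPC, one stray subnet in another.
SAMPLE = {
    "Subnets": [
        {"SubnetId": "subnet-aaa", "VpcId": "vpc-111"},
        {"SubnetId": "subnet-bbb", "VpcId": "vpc-111"},
        {"SubnetId": "subnet-ccc", "VpcId": "vpc-222"},
    ]
}

grouped = subnets_by_vpc(SAMPLE)
strays = outside_vpc(SAMPLE, "vpc-111")
```

If every correctly tagged subnet shows up as a stray, the controller is most likely pointed at a different VPC than the one holding the subnets.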

Could you also send the CloudTrail trace for the DescribeSubnets calls in both cases:

  • When you run the aws ec2 describe-subnets CLI command
  • The call made by the LB controller

You can email the details to k8s-alb-controller-triage AT amazon.com.

kishorj avatar Sep 22 '22 21:09 kishorj

I ran into something similar today (trying to create an external NLB). Specifically, it looks like the controller may not actually default to external load balancers (default service.beta.kubernetes.io/aws-load-balancer-internal to false) as indicated in the documentation.

This did not work:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
  name: nginx-external
  namespace: confluent-proxy
spec:
  loadBalancerClass: service.k8s.aws/nlb
  ports:
  - name: kafka
    port: 9092
    protocol: TCP
    targetPort: 9092
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    app.kubernetes.io/name: nginx
  sessionAffinity: None
  type: LoadBalancer

However, if I add this annotation, it starts working:

    service.beta.kubernetes.io/aws-load-balancer-internal: "false"

(it also works if I explicitly set service.beta.kubernetes.io/aws-load-balancer-subnets)

My subnets are all properly tagged, I believe (cluster name is justinlee-aws-privatelink)

kubernetes.io/cluster/justinlee-aws-privatelink | owned
kubernetes.io/role/elb | 1

But looking at CloudTrail, it looks like the controller was looking for the kubernetes.io/role/internal-elb tag vs. the regular external kubernetes.io/role/elb tag:

    "requestParameters": {
        "subnetSet": {},
        "filterSet": {
            "items": [
                {
                    "name": "tag:kubernetes.io/role/internal-elb",
                    "valueSet": {
                        "items": [
                            {},
                            {
                                "value": "1"
                            }
                        ]
                    }
                },
                {
                    "name": "vpc-id",
                    "valueSet": {
                        "items": [
                            {
                                "value": "vpc-04baa86d522c4fea0"
                            }
                        ]
                    }
                }
            ]
        }
    },
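
To read the filters out of a longer CloudTrail record without scanning the JSON by eye, something like the following could be used. It is a sketch that assumes only the `requestParameters` layout shown in the excerpt above; the function name is hypothetical.

```python
def describe_subnets_filters(request_parameters):
    """Map each filter name to the non-empty values it was called with."""
    result = {}
    items = request_parameters.get("filterSet", {}).get("items", [])
    for f in items:
        values = [
            v["value"]
            for v in f.get("valueSet", {}).get("items", [])
            if "value" in v  # skip empty entries such as {}
        ]
        result[f["name"]] = values
    return result


# The requestParameters excerpt from the CloudTrail event above.
EXCERPT = {
    "subnetSet": {},
    "filterSet": {
        "items": [
            {
                "name": "tag:kubernetes.io/role/internal-elb",
                "valueSet": {"items": [{}, {"value": "1"}]},
            },
            {
                "name": "vpc-id",
                "valueSet": {"items": [{"value": "vpc-04baa86d522c4fea0"}]},
            },
        ]
    },
}

filters = describe_subnets_filters(EXCERPT)
```

Here `filters` contains the internal-elb role tag and the vpc-id, and no filter on kubernetes.io/role/elb.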

justinrlee avatar Nov 09 '22 16:11 justinrlee

@justinrlee, the controller provisions an internal NLB by default unless you specify the scheme via the annotation service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing.

Since you've specified the loadBalancerClass, the annotation service.beta.kubernetes.io/aws-load-balancer-type: external has no effect.
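
The behaviour described in this thread can be summarised as a small decision function. This is an illustrative sketch, not the controller's code: the function names are hypothetical, and the precedence of the scheme annotation over the older aws-load-balancer-internal annotation is an assumption. The internal default and the effect of each annotation are taken from the comments above.

```python
SCHEME = "service.beta.kubernetes.io/aws-load-balancer-scheme"
LEGACY_INTERNAL = "service.beta.kubernetes.io/aws-load-balancer-internal"


def effective_scheme(annotations):
    """Resolve the NLB scheme from Service annotations (internal by default)."""
    if SCHEME in annotations:
        return annotations[SCHEME]
    if LEGACY_INTERNAL in annotations:
        # Assumption: "true" means internal, anything else internet-facing.
        if annotations[LEGACY_INTERNAL] == "true":
            return "internal"
        return "internet-facing"
    return "internal"


def role_tag_for(annotations):
    """Subnet role tag that auto-discovery would look for under this scheme."""
    if effective_scheme(annotations) == "internet-facing":
        return "kubernetes.io/role/elb"
    return "kubernetes.io/role/internal-elb"


# The Service from the report above carried only the type annotation.
reported = {"service.beta.kubernetes.io/aws-load-balancer-type": "external"}
fixed = dict(reported, **{SCHEME: "internet-facing"})

reported_tag = role_tag_for(reported)
fixed_tag = role_tag_for(fixed)
```

With only the type annotation, the lookup targets internal-elb subnets, which matches the CloudTrail filter shown earlier in the thread.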

kishorj avatar Nov 09 '22 17:11 kishorj

Ah, thank you! Somehow I missed that documentation note.

justinrlee avatar Nov 09 '22 19:11 justinrlee

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 07 '23 19:02 k8s-triage-robot

You basically need to tag the subnets correctly, and you have to do it for at least 2 subnets. When you describe the service you'll see an error message that misleadingly says it failed to discover at least 1 subnet; it actually has to discover at least 2. See https://aws.amazon.com/premiumsupport/knowledge-center/eks-load-balancer-controller-subnets/

If you manage your VPC with Terraform, it is great about updating tags instantly.

public_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/cluster/temp" = "shared" # temp extra cluster
  "kubernetes.io/role/elb" = "1"
}

private_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/cluster/temp" = "shared" # temp extra cluster
  "kubernetes.io/role/internal-elb" = "1"
}
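
The "at least 2 subnets" point above can be checked against describe-subnets output before deploying. A minimal sketch, assuming the standard `AvailabilityZone` field; counting distinct Availability Zones rather than raw subnets is my reading of the requirement (an ALB needs subnets in at least two zones), and the names and sample values are hypothetical.

```python
def distinct_azs(subnets):
    """Sorted list of the Availability Zones covered by the given subnets."""
    return sorted({s["AvailabilityZone"] for s in subnets})


def enough_zones(subnets, minimum=2):
    """True if the subnets span at least `minimum` Availability Zones."""
    return len(distinct_azs(subnets)) >= minimum


# Hypothetical tagged subnets: two share a zone, one is in a second zone.
TAGGED = [
    {"SubnetId": "subnet-aaa", "AvailabilityZone": "us-east-1a"},
    {"SubnetId": "subnet-bbb", "AvailabilityZone": "us-east-1a"},
    {"SubnetId": "subnet-ccc", "AvailabilityZone": "us-east-1b"},
]

zones = distinct_azs(TAGGED)
ok = enough_zones(TAGGED)
single_zone_ok = enough_zones(TAGGED[:2])
```

Two tagged subnets in the same zone (`TAGGED[:2]`) do not pass this check, which is one way a seemingly complete tagging setup could still fall short.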

neoakris avatar Feb 24 '23 16:02 neoakris

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Mar 26 '23 17:03 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Apr 25 '23 17:04 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 25 '23 17:04 k8s-ci-robot