
Issue with Reusing Pre-existing NLB Created via Terraform in Kubernetes Service

Open · wbar opened this issue Feb 13 '24 · 6 comments

Description:

I am encountering an issue when trying to integrate a Kubernetes Service with a pre-existing Network Load Balancer (NLB) that was created via Terraform. Despite correctly tagging the NLB and configuring the Service with the necessary annotations, I receive an error indicating a conflict because the NLB has "the same name but with different settings". This seems related to the AWS Load Balancer Controller's management of Security Groups, a feature recently introduced for Network Load Balancers.

Environment:

  • Kubernetes version: 1.28
  • AWS Load Balancer Controller version: 2.6.1, 2.7.1
  • Cloud provider or hardware configuration: AWS

Steps to Reproduce:

  1. Create an NLB using Terraform with specific tags:
     ingress.k8s.aws/stack: XXXXXXXX
     elbv2.k8s.aws/cluster
     ingress.k8s.aws/resource

  2. Configure a Kubernetes Service with annotations to use the pre-created NLB and specify the load balancer settings (a full manifest sketch follows these steps):
     service.beta.kubernetes.io/aws-load-balancer-alpn-policy: HTTP2Preferred
     service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
     service.beta.kubernetes.io/aws-load-balancer-ip-address-type: ipv4
     service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: 'false'
     service.beta.kubernetes.io/aws-load-balancer-name: XXXXXXX
     service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
     service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
     service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
     service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:ZZZZZZZ
     service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: SOME_ATTRS 
     service.beta.kubernetes.io/aws-load-balancer-type: external
    
  3. Attempt to deploy the Service in a new environment.
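
For reference, a minimal Service manifest carrying the annotations from step 2 might look roughly like the sketch below; the Service name, selector, and ports are hypothetical, and XXXXXXX, arn:ZZZZZZZ, and SOME_ATTRS are the placeholders from this report, not working values:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service                  # hypothetical name
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: external
        service.beta.kubernetes.io/aws-load-balancer-name: XXXXXXX   # must match the Terraform-created NLB
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        service.beta.kubernetes.io/aws-load-balancer-ip-address-type: ipv4
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
        service.beta.kubernetes.io/aws-load-balancer-alpn-policy: HTTP2Preferred
        service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:ZZZZZZZ
        service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: SOME_ATTRS
        service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: 'false'
    spec:
      type: LoadBalancer
      selector:
        app: my-app                     # hypothetical selector
      ports:
        - port: 443                     # hypothetical ports
          targetPort: 8443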

Expected Result: The Kubernetes Service should successfully associate with the pre-created NLB without any conflicts regarding the load balancer's name or settings.

Actual Result: Received an error message:

A load balancer with the same name 'XXXXXXX' exists, but with different settings.

This suggests an issue with how the AWS Load Balancer Controller handles existing NLBs, particularly regarding Security Group settings.

Additional Information:

  • The error persists even after setting service.beta.kubernetes.io/aws-load-balancer-security-groups: "" in an attempt to bypass automatic Security Group management.
  • Here is a snippet from the AWS Load Balancer Controller logs indicating the attempted settings:
    // resources["AWS::ElasticLoadBalancingV2::LoadBalancer"]
    {
        "LoadBalancer": {
            "spec": {
                "name": "XXXXXXX",
                "type": "network",
                "scheme": "internet-facing",
                "ipAddressType": "ipv4",
                "subnetMapping": [
                    {
                        "subnetID": "subnet-aaaaaaaaa"
                    },
                    {
                        "subnetID": "subnet-bbbbbbb"
                    },
                    {
                        "subnetID": "subnet-ggggggg"
                    }
                ],
                "securityGroups": [
                    {
                        "$ref": "#/resources/AWS::EC2::SecurityGroup/ManagedLBSecurityGroup/status/groupID"
                    },
                    "sg-TTTTTTTTTTTT"
                ]
            }
        }
    }
    

wbar, Feb 13 '24

Everything works on version 1.5.5.

wbar, Feb 13 '24

We do not currently support using an existing NLB for a Service, but a feature for this is coming soon. You can track it here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/228

TargetGroupBinding supports existing ALBs/NLBs; can you take a look to see if it helps with your case: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/guide/targetgroupbinding/targetgroupbinding/
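
For illustration, a TargetGroupBinding is a small custom resource that registers the pods behind a Service with an existing target group; a rough sketch follows (the resource and Service names and the target group ARN are placeholders, see the docs linked above for the authoritative spec):

    apiVersion: elbv2.k8s.aws/v1beta1
    kind: TargetGroupBinding
    metadata:
      name: my-tgb                      # hypothetical name
    spec:
      serviceRef:
        name: my-service                # the Service whose endpoints become targets
        port: 443
      targetGroupARN: arn:aws:elasticloadbalancing:...   # target group created in Terraform
      targetType: ip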

wweiwei-li, Feb 14 '24

Thanks for the response.

This was supported up to version 1.5.5.

Now that AWS has introduced Security Groups for NLBs and the controller has been changed to use this feature, it completely ignores:

service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: 'false'

Even adding this:

service.beta.kubernetes.io/aws-load-balancer-security-groups: ''

won't help, because the controller still builds the model with security groups:

// resources["AWS::ElasticLoadBalancingV2::LoadBalancer"]
{
    "LoadBalancer": {
        "spec": {
            "name": "XXXXXXX",
            "type": "network",
            "scheme": "internet-facing",
            "ipAddressType": "ipv4",
            "subnetMapping": [
                {
                    "subnetID": "subnet-aaaaaaaaa"
                },
                {
                    "subnetID": "subnet-bbbbbbb"
                },
                {
                    "subnetID": "subnet-ggggggg"
                }
            ],
            "securityGroups": [
                {
                    "$ref": "#/resources/AWS::EC2::SecurityGroup/ManagedLBSecurityGroup/status/groupID"
                },
                "sg-TTTTTTTTTTTT"
            ]
        }
    }
}

When service.beta.kubernetes.io/aws-load-balancer-security-groups: '' is set, could you skip adding these SGs to the model? Maybe that would avoid the error:

A load balancer with the same name 'XXXXXXX' exists, but with different settings.

wbar, Feb 15 '24

@wbar, by 1.5.5 I suppose you mean the Helm chart version. From controller version v2.6.0, we support SGs for NLBs: the controller will create front-end and back-end SGs and attach them to NLBs by default. If you want to opt out, you can specify the feature gate flag --feature-gates=NLBSecurityGroup=false, or --set controllerConfig.featureGates.NLBSecurityGroup=false in the Helm command. See https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/deploy/configurations/#feature-gates
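
In a Helm values file, that opt-out maps to the following sketch (the same setting as the --set flag above, nothing else assumed):

    # values.yaml for the aws-load-balancer-controller chart:
    # disable the NLBSecurityGroup feature gate so the controller does not
    # create and attach managed front-end/back-end SGs to the NLBs it manages.
    controllerConfig:
      featureGates:
        NLBSecurityGroup: false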

oliviassss, Feb 16 '24

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot, May 16 '24

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot, Jun 15 '24

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot, Jul 15 '24

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot, Jul 15 '24

> @wbar, by 1.5.5 I suppose you mean the Helm chart version. From controller version v2.6.0, we support SGs for NLBs: the controller will create front-end and back-end SGs and attach them to NLBs by default. If you want to opt out, you can specify the feature gate flag --feature-gates=NLBSecurityGroup=false, or --set controllerConfig.featureGates.NLBSecurityGroup=false in the Helm command. See https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/deploy/configurations/#feature-gates

It worked! Thanks :)

wbar, Dec 04 '24