aws-load-balancer-controller
Not working: restricting CIDR IP addresses for a LoadBalancer type Service
Describe the bug
I am using an NLB with nlb-target-type: ip and want to whitelist client IPs using loadBalancerSourceRanges, but there seems to be an issue with the AWS Load Balancer Controller: the restriction does not work, although it does work with nlb-target-type: instance.
Make no mistake, I am also setting preserve_client_ip.enabled=true as per the documentation, and I can see that it is enabled on the target group and that the SG rule with the whitelisted IP range is added successfully, yet I still cannot reach the application. I am currently whitelisting my ISP's public IP for testing, which works fine with nlb-target-type: instance.
Steps to reproduce
You can run the sample Windows-container-based Kubernetes config below.
Expected outcome
After a successful apply, I should only be able to access the application from 110.226.221.152/32.
- AWS Load Balancer Controller version: 2.3.0
- Kubernetes version: 1.21
- Using EKS (yes/no), if so version? Yes, 1.21
Additional Context:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis
  namespace: default
spec:
  selector:
    matchLabels:
      app: windows-server-iis
      tier: backend
      track: stable
  replicas: 1
  template:
    metadata:
      labels:
        app: windows-server-iis
        tier: backend
        track: stable
    spec:
      containers:
        - name: windows-server-iis
          image: mcr.microsoft.com/windows/servercore:1809
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
          command:
            - powershell.exe
            - -command
            - "Add-WindowsFeature Web-Server; Invoke-WebRequest -UseBasicParsing -Uri 'https://dotnetbinaries.blob.core.windows.net/servicemonitor/2.0.1.6/ServiceMonitor.exe' -OutFile 'C:\\ServiceMonitor.exe'; echo '<html><body><br/><br/><marquee><H1>Hello EKS!!!<H1><marquee></body><html>' > C:\\inetpub\\wwwroot\\default.html; C:\\ServiceMonitor.exe 'w3svc'; "
      nodeSelector:
        kubernetes.io/os: windows
---
apiVersion: v1
kind: Service
metadata:
  name: windows-server-iis-service
  namespace: default
  annotations:
    # service.beta.kubernetes.io/aws-load-balancer-type: "external"
    # service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "instance"
    # service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "80"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-name: test-iis-server-lb
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: windows-server-iis
    tier: backend
    track: stable
  # sessionAffinity: None
  loadBalancerSourceRanges:
    - "110.226.221.152/32"
  type: LoadBalancer
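For reference, based on the model the controller logs later in this thread, loadBalancerSourceRanges is expected to surface as ingress rules on the TargetGroupBinding created for the Service. A minimal sketch of that generated object, with the binding name and target group ARN as illustrative placeholders (not values from my cluster):

apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: k8s-default-windowss-example          # placeholder for the controller-generated name
  namespace: default
spec:
  targetType: ip
  serviceRef:
    name: windows-server-iis-service
    port: 80
  targetGroupARN: arn:aws:elasticloadbalancing:us-west-2:111111111111:targetgroup/example/0123456789abcdef   # placeholder ARN
  networking:
    ingress:
      # the Service's loadBalancerSourceRanges should appear here, restricting client traffic to the pod targets
      - from:
          - ipBlock:
              cidr: 110.226.221.152/32
        ports:
          - protocol: TCP
            port: 80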
AWS Load Balancer Controller Config
Name: aws-load-balancer-controller
Namespace: kube-system
CreationTimestamp: Thu, 04 Nov 2021 12:31:15 +0000
Labels: app.kubernetes.io/instance=aws-load-balancer-controller
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=aws-load-balancer-controller
app.kubernetes.io/version=v2.3.0
helm.sh/chart=aws-load-balancer-controller-1.3.2
Annotations: deployment.kubernetes.io/revision: 1
meta.helm.sh/release-name: aws-load-balancer-controller
meta.helm.sh/release-namespace: kube-system
Selector: app.kubernetes.io/instance=aws-load-balancer-controller,app.kubernetes.io/name=aws-load-balancer-controller
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app.kubernetes.io/instance=aws-load-balancer-controller
app.kubernetes.io/name=aws-load-balancer-controller
Annotations: prometheus.io/port: 8080
prometheus.io/scrape: true
Service Account: aws-load-balancer-controller
Containers:
aws-load-balancer-controller:
Image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.3.0
Ports: 9443/TCP, 8080/TCP
Host Ports: 0/TCP, 0/TCP
Command:
/controller
Args:
--cluster-name=my-test-cluster
--ingress-class=alb
--disable-restricted-sg-rules=true
Liveness: http-get http://:61779/healthz delay=30s timeout=10s period=10s #success=1 #failure=2
Environment: <none>
Mounts:
/tmp/k8s-webhook-server/serving-certs from cert (ro)
Volumes:
cert:
Type: Secret (a volume populated by a Secret)
SecretName: aws-load-balancer-tls
Optional: false
Priority Class Name: system-cluster-critical
I followed the instructions from here: https://aws.amazon.com/premiumsupport/knowledge-center/eks-cidr-ip-address-loadbalancer/. Please confirm whether this is a bug or whether I am missing something here.
Hi @kishorj, any chance of looking into this? I have done exactly what the documentation says.
@vmasule, could you email the controller logs to k8s-alb-controller-triage AT amazon.com, or share the model generated by the controller for your service?
@kishorj Below are the controller logs generated for exactly the above configuration, except that the ISP IP is different this time. I can also see the rules added in the SG, one of which whitelists the ISP IP. I also enabled NLB access logs, but nothing is logged there when I try to access the service via the NLB DNS name.
Can you please look into this? It is a bit urgent, as we committed to providing this feature based on the controller documentation, but it is not working for some reason. Let me know if you need more information.
Or do you have a reference to a working example with aws-load-balancer-nlb-target-type: ip for an internet-facing NLB?
{"level":"info","ts":1640517944.1814473,"logger":"controllers.service","msg":"successfully built model","model":"{\"id\":\"default/windows-server-iis-service\",\"resources\":{\"AWS::ElasticLoadBalancingV2::Listener\":{\"80\":{\"spec\":{\"loadBalancerARN\":{\"$ref\":\"#/resources/AWS::ElasticLoadBalancingV2::LoadBalancer/LoadBalancer/status/loadBalancerARN\"},\"port\":80,\"protocol\":\"TCP\",\"defaultActions\":[{\"type\":\"forward\",\"forwardConfig\":{\"targetGroups\":[{\"targetGroupARN\":{\"$ref\":\"#/resources/AWS::ElasticLoadBalancingV2::TargetGroup/default/windows-server-iis-service:80/status/targetGroupARN\"}}]}}]}}},\"AWS::ElasticLoadBalancingV2::LoadBalancer\":{\"LoadBalancer\":{\"spec\":{\"name\":\"k8s-default-windowss-00aaf49d08\",\"type\":\"network\",\"scheme\":\"internet-facing\",\"ipAddressType\":\"ipv4\",\"subnetMapping\":[{\"subnetID\":\"subnet-051e675f909a22a0f\"},{\"subnetID\":\"subnet-06808dc6a20ca0dcb\"}]}}},\"AWS::ElasticLoadBalancingV2::TargetGroup\":{\"default/windows-server-iis-service:80\":{\"spec\":{\"name\":\"k8s-default-windowss-74a8b74c3a\",\"targetType\":\"ip\",\"port\":80,\"protocol\":\"TCP\",\"ipAddressType\":\"ipv4\",\"healthCheckConfig\":{\"port\":\"traffic-port\",\"protocol\":\"TCP\",\"intervalSeconds\":10,\"healthyThresholdCount\":3,\"unhealthyThresholdCount\":3},\"targetGroupAttributes\":[{\"key\":\"preserve_client_ip.enabled\",\"value\":\"true\"},{\"key\":\"proxy_protocol_v2.enabled\",\"value\":\"false\"}]}}},\"K8S::ElasticLoadBalancingV2::TargetGroupBinding\":{\"default/windows-server-iis-service:80\":{\"spec\":{\"template\":{\"metadata\":{\"name\":\"k8s-default-windowss-74a8b74c3a\",\"namespace\":\"default\",\"creationTimestamp\":null},\"spec\":{\"targetGroupARN\":{\"$ref\":\"#/resources/AWS::ElasticLoadBalancingV2::TargetGroup/default/windows-server-iis-service:80/status/targetGroupARN\"},\"targetType\":\"ip\",\"serviceRef\":{\"name\":\"windows-server-iis-service\",\"port\":80},\"networking\":{\"ingress\":[{\"from\":[{\"ipBlock\":{\"cidr\":\"106.214.135.249/32\"}}],\"ports\":[{\"protocol\":\"TCP\",\"port\":80}]},{\"from\":[{\"ipBlock\":{\"cidr\":\"172.14.0.0/23\"}},{\"ipBlock\":{\"cidr\":\"172.12.2.0/23\"}}],\"ports\":[{\"protocol\":\"TCP\",\"port\":80}]}]},\"ipAddressType\":\"ipv4\"}}}}}}}"}
{"level":"info","ts":1640517944.7647452,"logger":"controllers.service","msg":"creating targetGroup","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80"}
{"level":"info","ts":1640517944.9572344,"logger":"controllers.service","msg":"created targetGroup","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:targetgroup/k8s-default-windowss-74a8b74c3a/8ba6196e3a53b4d5"}
{"level":"info","ts":1640517944.9648106,"logger":"controllers.service","msg":"modifying targetGroup attributes","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:targetgroup/k8s-default-windowss-74a8b74c3a/8ba6196e3a53b4d5","change":{"preserve_client_ip.enabled":"true"}}
{"level":"info","ts":1640517944.980313,"logger":"controllers.service","msg":"modified targetGroup attributes","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:targetgroup/k8s-default-windowss-74a8b74c3a/8ba6196e3a53b4d5"}
{"level":"info","ts":1640517945.0289276,"logger":"controllers.service","msg":"creating loadBalancer","stackID":"default/windows-server-iis-service","resourceID":"LoadBalancer"}
{"level":"info","ts":1640517945.355083,"logger":"controllers.service","msg":"created loadBalancer","stackID":"default/windows-server-iis-service","resourceID":"LoadBalancer","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:loadbalancer/net/k8s-default-windowss-00aaf49d08/658ca088d59b9e98"}
{"level":"info","ts":1640517945.3897755,"logger":"controllers.service","msg":"creating listener","stackID":"default/windows-server-iis-service","resourceID":"80"}
{"level":"info","ts":1640517945.445823,"logger":"controllers.service","msg":"created listener","stackID":"default/windows-server-iis-service","resourceID":"80","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:listener/net/k8s-default-windowss-00aaf49d08/658ca088d59b9e98/4d9e36830e690bad"}
{"level":"info","ts":1640517945.4773517,"logger":"controllers.service","msg":"creating targetGroupBinding","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80"}
{"level":"info","ts":1640517945.8155098,"logger":"controllers.service","msg":"created targetGroupBinding","stackID":"default/windows-server-iis-service","resourceID":"default/windows-server-iis-service:80","targetGroupBinding":{"namespace":"default","name":"k8s-default-windowss-74a8b74c3a"}}
{"level":"info","ts":1640517945.81556,"logger":"controllers.service","msg":"successfully deployed model","service":{"namespace":"default","name":"windows-server-iis-service"}}
{"level":"info","ts":1640517946.0076733,"msg":"authorizing securityGroup ingress","securityGroupID":"sg-0eec5b14b911a3509","permission":[{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"106.214.135.249/32","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null},{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"172.14.0.0/23","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null},{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"172.12.2.0/23","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null}]}
{"level":"info","ts":1640517946.211971,"msg":"authorized securityGroup ingress","securityGroupID":"sg-0eec5b14b911a3509"}
{"level":"info","ts":1640517946.4147427,"msg":"registering targets","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:targetgroup/k8s-default-windowss-74a8b74c3a/8ba6196e3a53b4d5","targets":[{"AvailabilityZone":null,"Id":"172.12.25.181","Port":80}]}
{"level":"info","ts":1640517946.685702,"msg":"registered targets","arn":"arn:aws:elasticloadbalancing:us-west-2:356817424680:targetgroup/k8s-default-windowss-74a8b74c3a/8ba6196e3a53b4d5"}
@vmasule from the logs, it looks like the inbound rule is added to the worker SG correctly:
{"level":"info","ts":1640517946.0076733,"msg":"authorizing securityGroup ingress","securityGroupID":"sg-0eec5b14b911a3509","permission":[{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"106.214.135.249/32","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null},{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"172.14.0.0/23","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null},{"FromPort":80,"IpProtocol":"tcp","IpRanges":[{"CidrIp":"172.12.2.0/23","Description":"elbv2.k8s.aws/targetGroupBinding=shared"}],"Ipv6Ranges":null,"PrefixListIds":null,"ToPort":80,"UserIdGroupPairs":null}]}
I assume it's 106.214.135.249/32 in this case.
You mentioned: "I am currently whitelisting my ISP's public IP for testing, which works fine when nlb-target-type: instance." How did you do the whitelisting? Isn't the IP added to the SG automatically?
@M00nF1sh and @kishorj
That's right, the rules are added correctly, but when I access the web app through the browser it is unreachable (i.e., it fails; I tried enabling NLB logs, but nothing gets logged there).
I assume it's 106.214.135.249/32 in this case.
That's right, that is the public IP added in the SG rules.
You mentioned: "I am currently whitelisting my ISP's public IP for testing, which works fine when nlb-target-type: instance." How did you do the whitelisting? Isn't the IP added to the SG automatically?
That's right, the IP gets added to the SG automatically and I can access the web app in that case, but we don't want to use nlb-target-type: instance because of its limitations.
I don't know whether it is the Windows-container-based setup that is causing this issue.
@kishorj @M00nF1sh Are you able to reproduce this issue? Please let me know; I badly need this feature before going to production.
@vmasule, could you check whether there is a proxy between your client and the load balancer? Also check whether you have any configuration allowing traffic to the node port range, or any specific configuration blocking port 80. NLB access logs are limited to TLS connections only, so you could use VPC Flow Logs to debug further.
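As an illustration, here is a minimal CloudFormation sketch for enabling VPC Flow Logs on the cluster VPC; the VPC ID, log group name, and IAM role ARN are placeholders that would need to be replaced:

AWSTemplateFormatVersion: "2010-09-09"
Resources:
  EksVpcFlowLog:
    Type: AWS::EC2::FlowLog
    Properties:
      ResourceType: VPC
      ResourceId: vpc-0123456789abcdef0                 # placeholder: the EKS cluster VPC
      TrafficType: ALL                                   # capture both accepted and rejected traffic
      LogDestinationType: cloud-watch-logs
      LogGroupName: /vpc/eks-flow-logs                   # placeholder log group
      DeliverLogsPermissionArn: arn:aws:iam::111111111111:role/flow-logs-role   # placeholder IAM role
      MaxAggregationInterval: 60

REJECT records for the client CIDR in the flow logs would point to a security group or network ACL deny rather than a controller problem.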
@vmasule, any updates?
@kishorj Sorry, I only saw this message today; I will check on the requested details and update here.
Update:
- There is no proxy between the client and the load balancer.
- There is no specific configuration blocking port 80.
- There are rules allowing traffic in on the node port range, but those were also added by the AWS Load Balancer Controller.
This is a very weird issue. We may try with Linux-based workloads to see whether the issue is the same and will update here, but it is definitely not working for Windows-based workloads.
@kishorj I can confirm that it works fine for Linux workloads; only Windows workloads hit this issue.
Could you please reproduce it at your end and confirm? This has become a major issue for us in terms of security and restricting the blast radius. Please help.
@kishorj Is there any kubelet flag that needs to be set for this to work with aws-load-balancer-nlb-target-type: ip? Please help.
Which CNI plugin do you use?
@kishorj I am using the AWS VPC CNI that comes by default with EKS.
@kishorj Any updates, please?
I'm seeing the same problem on our deployment.
I think the issue is with the ip target type: the load balancer controller registers the pods as targets directly, which bypasses the node.
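For comparison, the configuration reported as working earlier in this thread differs only in the target-type annotation; a minimal Service sketch with the instance target type (names and the CIDR mirror the ones used above):

apiVersion: v1
kind: Service
metadata:
  name: windows-server-iis-service
  namespace: default
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    # instance mode: the NLB targets the worker nodes via NodePort; the source-range
    # restriction works in this mode according to the reports above
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
    - "110.226.221.152/32"
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: windows-server-iis
    tier: backend
    track: stable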
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Facing the same issue. No SG rules are added when the target type is ip.
@kishorj Other people are also facing this issue; any updates on this?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@pmnhatdn: You can't reopen an issue/PR unless you authored it or you are a collaborator.
We're facing the same issue, any updates on this?