aws-load-balancer-controller
Support for NLBs targeting ALBs
Description
The goal of this pull request is to add support for targeting ALBs from NLBs by adding the "alb" value to the existing NLB target type annotation. It is currently a work in progress. If possible, it would be great to hear from the maintainers whether this is a feature you are willing to support in this project, and if so, whether you have any thoughts on what the Service spec should look like for targeting an ALB (see below). Thank you in advance!
Example:
```yaml
# Application Load Balancer Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echoserver-alb  # <-- the Ingress can be selected by name from the NLB Service using the "ingress" selector
  namespace: echoserver
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/tags: ...
    alb.ingress.kubernetes.io/subnets: ...
spec:
  defaultBackend:
    service:
      name: echoserver
      port:
        number: 80
---
# Network Load Balancer Service
apiVersion: v1
kind: Service
metadata:
  name: echoserver-nlb
  namespace: echoserver
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "alb"  # <-- new "alb" option
spec:
  # Not sure how everything below should look
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: LoadBalancer
  selector:
    ingress: echoserver-alb  # <-- select the ALB using the "ingress" selector
```
Also, would it make sense to extend the TargetGroupBinding CRD to support something like an IngressRef for this scenario (since we are targeting an Ingress instead of a Service)?
Example:
```yaml
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb
spec:
  ingressRef:  # <-- new option for the ALB target type(?)
    name: awesome-ingress  # route traffic to the awesome-ingress ALB
    port: 80
  targetGroupARN: <arn-to-targetGroup>
```
Checklist
- [x] Added tests that cover your change (if possible) (IN PROGRESS)
- [ ] Added/modified documentation as required (such as the README.md, or the docs directory) (TODO)
- [x] Manually tested
- [x] Made sure the title of the PR is a good description that can go into the release notes
BONUS POINTS checklist: complete for good vibes and maybe prizes?! :exploding_head:
- [ ] Backfilled missing tests for code in same general area :tada:
- [x] Refactored something and made the world a better place :star2:
Welcome @jon-fearer!
It looks like this is your first PR to kubernetes-sigs/aws-load-balancer-controller 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes-sigs/aws-load-balancer-controller has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @jon-fearer. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jon-fearer
To complete the pull request process, please assign m00nf1sh after the PR has been reviewed. You can assign the PR to them by writing /assign @m00nf1sh in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
This would be amazing, def willing to test.
The committers are authorized under a signed CLA.
- :white_check_mark: jon-fearer (f1f3d2bcca2cfac14cb4468af8ba790c05f270c8, 5ad66702a764376999170f39aec68bdde955403c)
I've implemented the spec changes outlined above and tested manually. I'll hold off on any other changes for now
@jon-fearer is this still WIP? If so, consider marking it as a draft (via GitHub's UI / API)
Any update on this? We need this functionality for our solution.
@lado-golijashvili-rft I am looking for some directional feedback from the maintainers (see the proposed schemas above). I guess that would be @kishorj or @M00nF1sh? Not 100% sure, though.
@jon-fearer: PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@jon-fearer any active work on this PR? This would be amazing.
> @jon-fearer any active work on this PR? This would be amazing.

The lack of any maintainer feedback has stalled his work.
Hi, folks!
This feature would be a life saver for us!
Can anyone unblock this, please?
Thank you very much!
Would definitely use this.
@kishorj @M00nF1sh - any update? We have been waiting for this functionality. It is a common use case.
Another common use case is creating and managing both the NLB and the associated backend ALB and target group with this controller.
@jon-fearer @smcavallo For your use case, would there be multiple ALBs behind the NLB, or just a single ALB behind an NLB?
Personally, I don't favor the current approach, as the Kubernetes Service object is designed to target pods only. Using Service objects to model an AWS NLB in this scenario does not seem appropriate to me.
Possible alternatives in my mind:
- use the Gateway API (https://gateway-api.sigs.k8s.io/references/spec/#gateway.networking.k8s.io/v1alpha2.TCPRouteRule) and backendRefs to reference ALBs. We have plans to support the Gateway API in upcoming major releases. (This feels better to me; see the rough sketch after this list.)
- use an annotation on the Ingress or a field on the IngressClass to specify a unique identifier representing an NLB (similar to how an IngressGroup's name works), and for each such identifier create an NLB and attach the corresponding ALBs to it.
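For illustration only, here is a minimal sketch of what the Gateway API direction could look like. The `aws-nlb` GatewayClass name and the Ingress-typed backendRef are assumptions made for this sketch, not part of any released API or of this controller; upstream TCPRoute backendRefs reference Services, so pointing one at an Ingress/ALB would require controller-specific support.

```yaml
# Hypothetical sketch only: "aws-nlb" and the Ingress-typed backendRef are
# illustrative assumptions, not part of any released API or controller.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: echoserver-nlb
  namespace: echoserver
spec:
  gatewayClassName: aws-nlb          # assumed GatewayClass that provisions an NLB
  listeners:
    - name: http
      protocol: TCP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: echoserver-to-alb
  namespace: echoserver
spec:
  parentRefs:
    - name: echoserver-nlb
  rules:
    - backendRefs:
        - group: networking.k8s.io   # illustrative: reference the Ingress (ALB) instead of a Service
          kind: Ingress
          name: echoserver-alb
          port: 80
```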
any thoughts? @kishorj
@M00nF1sh - this is our use case:
We accept both TCP and HTTP traffic from customers.
We have a public-facing NLB at the perimeter, managed by aws-load-balancer-controller, which accepts both TCP and HTTP traffic:
- TCP traffic is routed to Kubernetes ingress/pods just like any other use case of aws-load-balancer-controller
- HTTP traffic is routed to an ALB target group (below)
We have an internal-facing ALB, managed by aws-load-balancer-controller, with WAF and TLS/SNI for HTTP traffic:
- HTTP traffic is routed to Kubernetes ingress/pods just like any other use case of aws-load-balancer-controller
(We can't offload TLS/SNI to something inside the cluster - we need WAF, AWS certificates, and other AWS-specific integrations and ALB functionality.)
Everything is out-of-the-box aws-load-balancer-controller and is working as it's designed to be used.
The only thing we can't do is wire up the ALB (managed by aws-load-balancer-controller) to sit behind the NLB (managed by aws-load-balancer-controller).
Since aws-load-balancer-controller is managing BOTH the NLB and the ALB, it seems like it should be able to connect the two.
The alternative workaround is for us to manage both the NLB and the ALB via infrastructure-as-code (Crossplane!) and then use the TargetGroupBinding functionality of aws-load-balancer-controller to make sure the nodes/pods exist as part of the target groups (a sketch of that piece is at the end of this comment).
It feels like overkill to manage the entire config some other way when aws-load-balancer-controller can create and maintain everything; it just can't connect the infrastructure it created.
https://aws.amazon.com/blogs/networking-and-content-delivery/application-load-balancer-type-target-group-for-network-load-balancer/
All of the HTTP and TCP targets exist within Kubernetes; we just have to route the HTTP traffic through an ALB.
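For concreteness, here is a minimal sketch of the TargetGroupBinding piece of that workaround, assuming the target groups themselves (including the ALB-type target group behind the NLB) are created by IaC such as Crossplane; the names and the ARN placeholder are illustrative.

```yaml
# Minimal sketch of the IaC workaround: the target group is created outside the
# controller, and the controller only keeps the pod endpoints registered in it.
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: echoserver-tgb              # illustrative name
  namespace: echoserver
spec:
  serviceRef:
    name: echoserver                # in-cluster Service backing the ALB's target group
    port: 80
  targetType: ip
  targetGroupARN: <arn-of-target-group-created-by-iac>  # placeholder
```

The NLB-to-ALB attachment itself would still be wired up by the IaC tool, which is exactly the gap this PR is trying to close.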
We could make a very narrow implementation of the Gateway API that deploys an NLB and ALB (and, if we want, integrates with PrivateLink).
In Kubernetes, the Service API doesn't express a desired state that looks like that diagram, @jon-fearer. That is why we shouldn't use Service to manage it.
If someone wants to add complexity to the AWS Load Balancer controller to add Gateway, they can - it's open source. However, that is a big ask.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
@M00nF1sh I am wondering if you can expand on why you find it inappropriate to model the NLB with the Service object?
I am encountering a use case where I need to inject an ALB behind the NLB for HTTP-to-HTTPS redirection. With the load balancers managed by aws-load-balancer-controller, this becomes quite a difficult ask. As @smcavallo pointed out earlier, we could move this logic into IaC and manage it with target groups, but I find that a less desirable option, as we prefer to manage the LBs with the controller rather than with IaC.
I think this also opens up additional functionality that aligns with this controller and how it works. It would allow a Service defining an NLB to target an Ingress defining an ALB, which would make it possible to model the more complex networking use cases that load balancers support.
I think this change can only be positive. The aws-load-balancer-controller should be able to model and implement the features and capabilities of load balancers in AWS. Allowing the controller to be agnostic of implementation details lets it remain at feature parity and in lockstep with additional AWS features released down the road.
/remove-lifecycle stale
@matthewdaltonfanduel The main reason is that the Kubernetes Service object is designed to target pods, not other objects like an ALB. We would just be abusing the Kubernetes Service spec if we implemented it this way. Personally, I still prefer that we implement this via the Gateway API (https://gateway-api.sigs.k8s.io/), which is more flexible.
There is an ongoing Gateway API implementation from community folks (https://github.com/kubernetes-sigs/aws-load-balancer-controller/pull/3146); we can consider implementing this use case with that.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:
- Mark this PR as fresh with /remove-lifecycle rotten
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Looks like @M00nF1sh has suggested a better solution here, so I'm going to close this. Thanks, all!