aws-load-balancer-controller
NLB IP Target registration is extremely slow
I've configured an NLB with IP targets to point to an instance of Traefik 2 and noticed that when I have pod readiness gates enabled, it might take upwards of 5 minutes for a single target to register and be considered healthy. Is this normal/expected?
The NLB target registration can take from 90 to 180 seconds to complete. After registration, the targets are marked healthy only after the configured health check passes. This delay is from the AWS NLB and is currently expected. It is not due to the pod readiness gate configuration.
In case of rolling updates to your application, the pod readiness gate helps mitigate the effects of this delay by making sure the existing pods will not be terminated until the newly registered targets show up as healthy.
Ah, thank you. Is there anything at all I can do to help speed that up?
@abatilo You can contact the NLB team via a support ticket and ask them to accelerate it.
From the controller's perspective, we will add a note about this limitation to our docs.
/kind documentation
@abatilo - I would encourage anyone encountering this issue to reach out to AWS support. They are aware of the issue. AFAIK, it has been an issue for 3+ years (per stackoverflow). The more people that contact them, the more likely it will get fixed. ;)
Can confirm that I've observed the same behaviour when testing NLB ingress with IP targets @abatilo. The controller registers a new pod with the target group within a few seconds. I'd expect the NLB health check to kick in and register the service in 20-30s (2 or 3 health checks, 10s interval). Instead of 20-30s, it's 3-5 minutes.
I can confirm this is still present... feels a bit like AWS is letting people down by delaying a fix for it for so long...
I don't think they see it as a bug :) This is not related to k8s or the load balancer controller and probably doesn't belong here. If you want NLB to take less than 3 minutes to register targets, tell your AWS support rep!
@jbg I did, they mentioned this thread in their response... "Mom, can I go out?" "Ask your dad!" "Dad?" "Ask your mom!" XD
If you're contacting AWS support about this, it's probably advisable to demonstrate the issue with an NLB provisioned manually or via CloudFormation, so that first-level support can't point the finger at aws-load-balancer-controller or k8s as the source of the delay.
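For example, here is a minimal CloudFormation sketch of a bare NLB with an IP target group you can use to demonstrate the delay outside of Kubernetes: register a reachable IP by hand and time how long it takes to go healthy. This is only a sketch; the resource names and parameter values are placeholders, not something from this thread.
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
Resources:
  ReproNLB:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Type: network          # plain NLB, no controller involved
      Scheme: internal
      Subnets: !Ref SubnetIds
  ReproTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      TargetType: ip         # same target type the controller uses for NLB IP mode
      Protocol: TCP
      Port: 80
      VpcId: !Ref VpcId
  ReproListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref ReproNLB
      Port: 80
      Protocol: TCP
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref ReproTargetGroup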
@jbg I did, they mentioned they will add a note with my case to the existing issue on the NLB.
@paul-lupu The NLB team is already aware of this issue and has fixes in progress. They have already rolled out a new HC system that slightly improves the registration time, and they plan to improve it to <=60 sec (I don't have an ETA on this).
From the controller's point of view, we cannot do much until NLB improves it. If the registration time is a concern, we can use NLB instance mode (supported by newer versions of this controller as well). If spot instances are being used, we can use node selectors (service.beta.kubernetes.io/aws-load-balancer-target-node-labels) to use non-spot instances as the NLB backend.
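For reference, a minimal sketch of those two workarounds expressed as Service annotations. The node label key/value shown is only an illustrative example (it matches EKS managed node groups); use whatever label distinguishes your non-spot nodes.
service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    # instance mode: the NLB targets node NodePorts, so pod rollout is not
    # gated on per-pod IP target registration
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    # only register nodes matching these labels (label key/value is an
    # illustrative assumption, e.g. to exclude spot nodes)
    service.beta.kubernetes.io/aws-load-balancer-target-node-labels: eks.amazonaws.com/capacityType=ON_DEMAND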
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
I have the same issue with an NLB fronting ECS containers. Very frustrating. The container is up and receives and responds to health checks in seconds, yet the TG takes minutes to recognize a container as healthy. If containers go down for some reason and need to be re-run, this could potentially leave a major gap in service availability. It makes the NLB problematic to use, but I have to use it in order to do the TCP passthrough I need to the ECS container so that we can implement mTLS at the container. Very frustrating delay. Is there a general NLB ticket/issue that anyone might have a link to that I can help pile on with?
@bennettellis since it is a problem with AWS internal implementation rather than any open-source component, the best place to "pile on" is your AWS support
I worked around this with Traefik by setting the NLB deregistration timeout, deployment pod grace period, and container grace period to 5 minutes. I also needed to ensure the container health check continued to report success during this in-between time, using Traefik's --ping.terminatingStatusCode=204.
This leaves the old pod in a "terminating but still running" state for 5 minutes to give the NLB time to complete the registration process for the new pod.
NAME READY STATUS RESTARTS AGE
traefik-5fc5468b49-7htxk 1/1 Terminating 0 6m10s
traefik-5fc5468b49-hgdll 1/1 Terminating 0 5m50s
traefik-5fc5468b49-t2qk7 1/1 Terminating 0 6m31s
traefik-f5c4b56fb-478hc 1/1 Running 0 29s
traefik-f5c4b56fb-7858g 1/1 Running 0 49s
traefik-f5c4b56fb-czqbj 1/1 Running 0 69s
NLB service:
service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.timeout_seconds=300
k8s deployment pods:
deployment:
  terminationGracePeriodSeconds: 315
Traefik container settings:
--ping.terminatingStatusCode=204
--entrypoints.metrics.transport.lifecycle.requestacceptgracetimeout=5m
--entrypoints.traefik.transport.lifecycle.requestacceptgracetimeout=5m
--entrypoints.web.transport.lifecycle.requestacceptgracetimeout=5m
--entrypoints.websecure.transport.lifecycle.requestacceptgracetimeout=5m
If your app doesn't have a feature like this, I think the container delay could also be accomplished with a container lifecycle preStop sleep, as long as the health check continues reporting success.
containers:
  - name: application
    lifecycle:
      preStop:
        exec:
          command: [
            "sh", "-c",
            # Introduce a delay to the shutdown sequence to wait for the
            # pod eviction event to propagate. Then, gracefully shut down.
            "sleep 300 && killall -SIGTERM application",
          ]
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Unfortunately, 5 minutes is too long when using spot instances.
/remove-lifecycle stale
Does anyone have any update on this? Is there a way to determine when the registration is actually completed and the pod starts receiving requests from the NLB?
@nbourdeau - use pod readiness gates. They are super easy to set up, and essentially your pod will not be considered ready until the LB sees the target as healthy. Unfortunately, deployments are slow, but at least with pod readiness gates they are stable.
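For anyone looking for the setup: readiness gate injection is enabled per namespace via a label. A minimal sketch, assuming controller v2.x; the namespace name is only illustrative.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app          # illustrative namespace name, not from this thread
  labels:
    # the controller injects readiness gates into pods created in
    # namespaces carrying this label
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled
With that in place, a rolling update only proceeds once the NLB reports the new targets as healthy, so the slow registration costs time but not availability.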
Well, in my use case this is not really usable because it is a singleton deployment with an EBS volume mounted, and I cannot have 2 pods with the same volume running at the same time ...
But the strange thing is the NLB target is marked healthy in the target group, yet there is still a delay before the pod actually starts receiving requests ... will that even work with pod readiness gates?
Is there a way to improve this time? It's taking more than 5 minutes when I use an NLB with the TCP protocol.
seems like the answer is no ... I contacted AWS support and the answer is: we are working on improving the delay ... use pod readiness gates if you can ...
I will comment with more details when time permits, but note you can reduce deregistration to around 45s (from over 3 minutes) by using HTTP(S) instead of TCP health checks, which helps a lot for spot (given the only 2 minutes' notice you get there for reclaim, i.e. you can now get error-free spot drains).
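In annotation form, switching the target group health check from TCP to HTTP(S) looks roughly like this (a sketch only; the port and path values are placeholders, and a fuller configuration is shared further down the thread):
service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: HTTP
service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8080"    # placeholder port
service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /healthz  # placeholder path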
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
@ldemailly can you share your health check configuration? I'm trying HTTP but it can still take almost two minutes to deregister targets.
@roimor
serviceAnnotations:
  service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
  service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
  service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: HTTPS
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8443"
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /ready
combined with, on the Istio/Envoy side:
containers:
  - name: istio-proxy
    lifecycle:
      preStop:
        exec:
          command:
            - "/bin/bash"
            - "-c"
            - curl -XPOST http://localhost:15000/healthcheck/fail && sleep 45 &&
              curl -XPOST http://localhost:15000/drain_listeners?graceful && sleep 5
to reach the envoy internal health endpoint:
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: localhost-service-entry
  namespace: istio-system
spec:
  hosts:
    - localhost.service.entry
  ports:
    - number: 15099
      name: http-port
      protocol: HTTP
      targetPort: 15000
  location: MESH_INTERNAL
  resolution: STATIC
  endpoints:
    - address: 127.0.0.1
and
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: external-virtualservice
  namespace: istio-system
spec:
  hosts:
    - '*'
  gateways:
    - ...
  http:
    - name: http-hc-route
      match:
        - uri:
            exact: /ready
      route:
        - destination:
            host: localhost.service.entry
            port:
              number: 15099
and because the health check will come in with an IP as the target, you need to serve certs on *:
- port:
    number: 443
    name: https-no-sni
    protocol: HTTPS
  tls:
    mode: SIMPLE
    credentialName: your-cert
  hosts:
    - "*"
Out of curiosity: Is this an issue for ALB IP target mode as well?
@youwalther65, ALB IP target registration is faster than NLB's.
The issue being discussed is precisely about NLBs, not ALBs. There are many cases where an ALB is not an appropriate solution.