
NLB IP Target registration is extremely slow

Open abatilo opened this issue 3 years ago • 61 comments

I've configured an NLB with IP targets to point to an instance of Traefik 2 and noticed that when I have pod readiness gates enabled, it might take upwards of 5 minutes for a single target to register and be considered healthy. Is this normal/expected?

abatilo avatar Feb 18 '21 15:02 abatilo

The NLB target registration can take from 90 to 180 seconds to complete. After registration, the targets are marked healthy only after the configured health check passes. This delay is from the AWS NLB and is currently expected. It is not due to the pod readiness gate configuration.

In case of rolling updates to your application, the pod readiness gate helps mitigate the effects of this delay by making sure the existing pods will not be terminated until the newly registered targets show up as healthy.
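For reference, with v2.x of the controller the readiness gate is injected automatically once the target namespace is labeled; a minimal sketch (the namespace name here is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app  # placeholder - use your application's namespace
  labels:
    # tells the controller's webhook to inject the readiness gate
    # into pods created in this namespace
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled
```

Pods created (not just restarted) after the label is applied will carry the `target-health.elbv2.k8s.aws` readiness gate.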

kishorj avatar Feb 18 '21 18:02 kishorj

Ah, thank you. Is there anything at all I can do to help speed that up?

abatilo avatar Feb 19 '21 10:02 abatilo

@abatilo You can contact NLB team via support ticket to accelerate it.

From the controller's perspective, we will add some docs to note this limitation on our docs.

/kind documentation

M00nF1sh avatar Mar 03 '21 22:03 M00nF1sh

@abatilo - I would encourage anyone encountering this issue to reach out to AWS support. They are aware of the issue. AFAIK, it has been an issue for 3+ years (per stackoverflow). The more people that contact them, the more likely it will get fixed. ;)

keperry avatar Apr 09 '21 13:04 keperry

Can confirm that I've observed the same behaviour when testing NLB ingress with IP targets @abatilo. The controller registers a new pod with the target group within a few seconds. I'd expect the NLB health check to kick in and register the service in 20-30s (2 or 3 health checks at a 10s interval). Instead of 20-30s, it takes 3-5 minutes.

juozasget avatar Jun 25 '21 20:06 juozasget

I can confirm this is still present... it feels a bit like AWS is letting people down by delaying a fix for so long...

paul-lupu avatar Sep 10 '21 12:09 paul-lupu

I don't think they see it as a bug :) This is not related to k8s or the load balancer controller and probably doesn't belong here. If you want NLB to take less than 3 minutes to register targets, tell your AWS support rep!

jbg avatar Sep 10 '21 12:09 jbg

@jbg I did, they mentioned this thread in their response... "Mom, can I go out?" "Ask your dad!" "Dad?" "Ask your mom!" XD

paul-lupu avatar Sep 10 '21 13:09 paul-lupu

If you're contacting AWS support about this, it's probably advisable to demonstrate the issue with an NLB provisioned manually or via CloudFormation, so that first-level support can't point the finger at aws-load-balancer-controller or k8s as the source of the delay.

jbg avatar Sep 10 '21 13:09 jbg

@jbg I did, they mentioned they will add a note with my case to the existing issue on the NLB.

paul-lupu avatar Sep 10 '21 13:09 paul-lupu

@paul-lupu The NLB team is already aware of this issue and has fixes in progress. They have already rolled out a new HC system that slightly improves the registration time, and they plan to improve it to be <=60 sec (I don't have an ETA on this).

From the controller's point of view, we cannot do much until NLB improves it. If the registration time is a concern, we can use NLB instance mode (supported by newer versions of this controller as well). If spot instances are being used, we can use node selectors (service.beta.kubernetes.io/aws-load-balancer-target-node-labels) to use non-spot instances as the NLB backend.
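For anyone trying the instance-mode workaround, the service annotations would look roughly like this (the node label shown is the one EKS managed node groups apply; substitute your own label if you manage nodes differently):

```yaml
service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    # instance mode: targets are registered by node instance ID, not pod IP
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    # restrict backends to on-demand (non-spot) nodes; example label
    service.beta.kubernetes.io/aws-load-balancer-target-node-labels: eks.amazonaws.com/capacityType=ON_DEMAND
```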

M00nF1sh avatar Sep 10 '21 17:09 M00nF1sh

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Dec 09 '21 18:12 k8s-triage-robot

I have the same issue with an NLB fronting ECS containers. Very frustrating. The container is up and receives and responds to health checks in seconds, yet the TG takes minutes to recognize the container as healthy. If containers go down for some reason and need to be re-run, this could potentially leave a major gap in service availability. It makes the NLB problematic to use, but I have to use it in order to do the TCP passthrough I need to the ECS container so that we can implement mTLS at the container. Very frustrating delay. Is there a general NLB ticket/issue that anyone might have a link to that I can help pile on with?

bennettellis avatar Jan 05 '22 04:01 bennettellis

@bennettellis since it is a problem with AWS internal implementation rather than any open-source component, the best place to "pile on" is your AWS support

jbg avatar Jan 05 '22 04:01 jbg

I worked around this with Traefik by setting the NLB deregistration timeout, deployment pod grace period, and container grace period to 5 minutes. I also needed to ensure the container health check continued to report success during this in-between time with Traefik's --ping.terminatingStatusCode=204.

This leaves the old pods in a "terminating but still running" state for 5 minutes to give the NLB time to complete the registration process for the new pods.

NAME                                      READY   STATUS        RESTARTS   AGE
traefik-5fc5468b49-7htxk                  1/1     Terminating   0          6m10s
traefik-5fc5468b49-hgdll                  1/1     Terminating   0          5m50s
traefik-5fc5468b49-t2qk7                  1/1     Terminating   0          6m31s
traefik-f5c4b56fb-478hc                   1/1     Running       0          29s
traefik-f5c4b56fb-7858g                   1/1     Running       0          49s
traefik-f5c4b56fb-czqbj                   1/1     Running       0          69s

nlb service

service:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.timeout_seconds=300

k8s deployment pods

deployment:
  terminationGracePeriodSeconds: 315

traefik container settings

  --ping.terminatingStatusCode=204
  --entrypoints.metrics.transport.lifecycle.requestacceptgracetimeout=5m
  --entrypoints.traefik.transport.lifecycle.requestacceptgracetimeout=5m
  --entrypoints.web.transport.lifecycle.requestacceptgracetimeout=5m
  --entrypoints.websecure.transport.lifecycle.requestacceptgracetimeout=5m

If your app doesn't have a feature like this, I think the container delay could also be accomplished with a preStop lifecycle sleep, as long as the health check continues reporting success.

containers:
  - name: application
    lifecycle:
      preStop:
        exec:
          command: [
            "sh", "-c",
            # Introduce a delay to the shutdown sequence to wait for the
            # pod eviction event to propagate. Then, gracefully shutdown
            "sleep 300 && killall -SIGTERM application",
          ]

rayjanoka avatar Jan 12 '22 03:01 rayjanoka

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 12 '22 18:04 k8s-triage-robot

Unfortunately, 5 minutes is too long when using spot instances

/remove-lifecycle stale

ldemailly avatar Apr 12 '22 18:04 ldemailly

Does anyone have an update on this? Is there a way to determine when the registration is actually complete and the pod starts receiving requests from the NLB?

nbourdeau avatar Apr 28 '22 13:04 nbourdeau

@nbourdeau - use pod readiness gates. They are super easy to set up, and essentially your pod will not be considered ready until the LB sees the target as healthy. Unfortunately, deployments are slow, but at least with pod readiness gates they are stable.

keperry avatar Apr 28 '22 14:04 keperry

@nbourdeau - use pod readiness gates. They are super easy to set up, and essentially your pod will not be considered ready until the LB sees the target as healthy. Unfortunately, deployments are slow, but at least with pod readiness gates they are stable.

Well, in my use case this is not really usable, because it is a singleton deployment with an EBS volume mounted and I cannot have 2 pods with the same volume running at the same time...

But the strange thing is that the NLB target is marked healthy in the target group, yet there is still a delay before the pod actually starts receiving requests... will that even work with pod readiness gates?

nbourdeau avatar Apr 28 '22 15:04 nbourdeau

Is there a way to improve this time? It's taking more than 5 minutes when I use an NLB with the TCP protocol.

hellenavilarosa avatar May 25 '22 18:05 hellenavilarosa

Is there a way to improve this time? It's taking more than 5 minutes when I use an NLB with the TCP protocol.

Seems like the answer is no... I contacted AWS support and the answer was: we are working on improving the delay... use pod readiness gates if you can...

nbourdeau avatar May 25 '22 18:05 nbourdeau

I will comment with more details when time permits, but note that you can reduce deregistration to around 45s (from over 3 minutes) by using HTTP(S) instead of TCP health checks. This helps a lot for spot (where you only get a 2-minute reclaim notice), i.e. you can now get an error-free spot drain.

ldemailly avatar May 26 '22 01:05 ldemailly

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jun 25 '22 02:06 k8s-triage-robot

/remove-lifecycle rotten

choeffer avatar Jun 25 '22 06:06 choeffer

@ldemailly can you share your health check configuration? I'm trying HTTP, but it can still take almost two minutes to deregister targets

roimor avatar Jul 20 '22 08:07 roimor

@roimor

    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: HTTPS
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8443"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /ready

combined with, istio/envoy wise:

            containers:
              - name: istio-proxy
                lifecycle:
                  preStop:
                    exec:
                      command:
                        - "/bin/bash"
                        - "-c"
                        - curl -XPOST http://localhost:15000/healthcheck/fail && sleep 45 &&
                          curl -XPOST http://localhost:15000/drain_listeners?graceful && sleep 5

to reach the envoy internal health endpoint:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: localhost-service-entry
  namespace: istio-system
spec:
  hosts:
    - localhost.service.entry
  ports:
    - number: 15099
      name: http-port
      protocol: HTTP
      targetPort: 15000
  location: MESH_INTERNAL
  resolution: STATIC
  endpoints:
    - address: 127.0.0.1

and

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: external-virtualservice
  namespace: istio-system
spec:
  hosts:
    - '*'
  gateways:
    - ...
  http:
    - name: http-hc-route
      match:
        - uri:
            exact: /ready
      route:
        - destination:
            host: localhost.service.entry
            port:
              number: 15099

and because the health check will come in with the IP as the target (no hostname), you need to serve certs on *

    - port:
        number: 443
        name: https-no-sni
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: your-cert
      hosts:
      - "*" 

ldemailly avatar Jul 20 '22 16:07 ldemailly

Out of curiosity: Is this an issue for ALB IP target mode as well?

youwalther65 avatar Jul 28 '22 14:07 youwalther65

@youwalther65, ALB IP target registration is faster than NLB.

kishorj avatar Jul 28 '22 18:07 kishorj

The issue being discussed is precisely NLBs, not ALBs. There are many cases where an ALB is not an appropriate solution.

bennettellis avatar Jul 28 '22 18:07 bennettellis