ingress-nginx

preStop hook should account for AWS NLB deregistration delay

Open youwalther65 opened this issue 5 months ago • 1 comments

Currently, during HPA scale-in, NGINX Ingress pods are sometimes terminated faster than the AWS NLB target deregistration delay (deregistration_delay.timeout_seconds). This leads to traffic being sent to non-existent targets/pods and to client errors such as HTTP 499.

I am currently working around this by adding a sleep equal to deregistration_delay.timeout_seconds (270s in this case) to the preStop hook:

      lifecycle:
        preStop:
          exec:
            command:
            - /bin/sh
            - -c
            - sleep 270; /wait-shutdown

I am not changing terminationGracePeriodSeconds for now, so after the 270s sleep this leaves NGINX 30s to shut down connections.

This can be done in the Helm values.yaml, which contains a controller.lifecycle section, together with the relevant AWS NLB service annotation:

  service:
    annotations:
...
      service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.timeout_seconds=270
...
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          - sleep 270; /wait-shutdown
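
To make the timing budget explicit, here is a minimal sketch of the same values with the grace period written out. It assumes a terminationGracePeriodSeconds of 300, which is what the 30s figure above implies (270s sleep + 30s for /wait-shutdown). Kubernetes counts the time spent in the preStop hook against the grace period, so the sleep has to stay below it or the pod is killed before NGINX can drain. Treat the numbers as illustrative, not as a recommendation.

```yaml
controller:
  # Assumption: 300s total grace period = 270s sleep + 30s for NGINX to drain.
  # The preStop hook runs inside this window, so keep the sleep below this value.
  terminationGracePeriodSeconds: 300
  service:
    annotations:
      # Keep this value in sync with the sleep in the preStop hook below.
      service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.timeout_seconds=270
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          # Wait out the NLB deregistration delay, then let NGINX shut down gracefully.
          - sleep 270; /wait-shutdown
```

If the deregistration delay is raised, the sleep and the grace period would have to be raised with it; otherwise the drain window shrinks or disappears.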

Is there a better way to solve this?

youwalther65, Aug 27 '24 09:08