kubernetes-ingress

occasional hang on config reload

Open hostalp opened this issue 3 years ago • 9 comments

Describe the bug We're running this ingress controller with the PROXY protocol enabled (there's a load balancer in front of it that uses this protocol to pass the original request info to its backends) and we're encountering an occasional hang of some instance (1 out of 3 running) after a config reload.

In the logs we can see that there's a reload action, but that's where the log ends; nothing follows it. The pod appears to be running and connections to the controller can be established (only there's no response from the controller itself), so Kubernetes just lets things remain in that state (2 controller instances OK, 1 broken).

The end result is that some requests fail while others work fine, depending on where the request is routed. In such a case we have to restart the controller (kubectl rollout restart ...) to get things back to a fully working state.

I'm not sure if it could be related to the use of the PROXY protocol, but that's the only thing I've found as possibly related to this kind of issue - see the links below for examples of some similar cases. It could, however, be something completely different.

If there's any additional information you need please let me know.
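For completeness, this is roughly how we recover at the moment (the DaemonSet and namespace names come from our Helm release shown below and may differ in other setups):

# find the broken instance, then restart the whole DaemonSet
kubectl get pods -n ingress -o wide
kubectl -n ingress rollout restart daemonset/nginx-ingress-nginx-ingress
kubectl -n ingress rollout status daemonset/nginx-ingress-nginx-ingress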

Your environment

  • Version of the Ingress Controller: 2.2.0 (Helm Chart)
  • Version of Kubernetes: 1.20.10
  • Using NGINX or NGINX Plus: NGINX

Additional context PROXY protocol is enabled (proxy-protocol: "True" + set-real-ip-from: "..." + real-ip-header: "proxy_protocol"), therefore it could be related to something like:

  • https://www.ibm.com/support/pages/ingress-controller-may-stop-processing-requests-and-hang
  • https://www.ibm.com/support/pages/apar/LI80910
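In case it's useful, the PROXY protocol settings can be verified against the generated NGINX configuration inside the pod, roughly like this (<pod-name> is a placeholder, and this assumes exec access to the container):

# dump the generated config and filter for the PROXY-protocol-related directives
kubectl exec -n ingress <pod-name> -- nginx -T | grep -E 'proxy_protocol|set_real_ip_from|real_ip_header'
# this should show listen ... proxy_protocol, real_ip_header proxy_protocol and the set_real_ip_from ranges configured above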

hostalp avatar Jun 06 '22 12:06 hostalp

Hi @hostalp thanks for reporting!

Be sure to check out the docs while you wait for a human to take a look at this :slightly_smiling_face:

Cheers!

github-actions[bot] avatar Jun 06 '22 12:06 github-actions[bot]

Are you able to share log output from the failed pod? That should be able to tell us a bit more.
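Something like the following is usually enough to start with (namespace taken from this thread, pod name is a placeholder):

kubectl logs -n ingress <pod-name>
kubectl describe pod -n ingress <pod-name>
kubectl get events -n ingress --sort-by=.lastTimestamp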

brianehlert avatar Jun 06 '22 13:06 brianehlert

@hostalp can you share your deployment files (including the configmap)? Any logs would also be helpful.

jasonwilliams14 avatar Jun 06 '22 17:06 jasonwilliams14

I will provide logs the next time this happens, as the older ones are already unavailable. The installation was done via Helm using these customized values (the installation & upgrade procedures are also described at the top of the file):

# helm repo add nginx https://helm.nginx.com/stable
# helm repo update
# helm upgrade --install nginx-ingress nginx/nginx-ingress --create-namespace -n ingress -f nginx-ingress-values_DEV.yaml

## Upgrade
# kubectl apply -f crds/
# helm upgrade --install nginx-ingress nginx/nginx-ingress --create-namespace -n ingress -f nginx-ingress-values_DEV.yaml

controller:
  ## The name of the Ingress controller daemonset or deployment.
  ## Autogenerated if not set or set to "".
  # name: nginx-ingress


  ## The kind of the Ingress controller installation - deployment or daemonset.
  kind: daemonset

  config:
    ## The entries of the ConfigMap for customizing NGINX configuration.
    entries:
      # Extended log format, includes related K8S resource information
      log-format: '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" "$resource_name" "$resource_namespace" "$service"'

      # PROXY protocol, enabled on the load balancer, used to obtain correct client IP address
      proxy-protocol: "True"
      set-real-ip-from: "10.42.0.0/16,10.2.20.0/22"
      real-ip-header: "proxy_protocol"

  ## It is recommended to use your own TLS certificates and keys
  defaultTLS:
    ## The secret with a TLS certificate and key for the default HTTPS server.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.defaultTLS.cert` and `controller.defaultTLS.key` parameters.
    ## Format: <namespace>/<secret_name>
    secret: &defaultTLSSecret cert-manager/dev-bcloud-tls

  wildcardTLS:
    ## The secret with a TLS certificate and key for every Ingress/VirtualServer host that has TLS enabled but no secret specified.
    ## The value must follow the following format: `<namespace>/<name>`.
    ## Used as an alternative to specifying a certificate and key using `controller.wildcardTLS.cert` and `controller.wildcardTLS.key` parameters.
    ## Format: <namespace>/<secret_name>
    secret: *defaultTLSSecret

  ## The resources of the Ingress controller pods.
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: "1"
      memory: 1Gi

  ## The volumes of the Ingress controller pods.
  volumes:
    - name: tz-volume
      hostPath:
        path: /usr/share/zoneinfo/Europe/Prague

  ## The volumeMounts of the Ingress controller pods.
  volumeMounts:
    - name: tz-volume
      mountPath: /etc/localtime
      readOnly: true

  ## Only for Kubernetes >= 1.18
  ## New Ingresses without an ingressClassName field specified will be assigned the class specified in `controller.ingressClass`.
  setAsDefaultIngress: true

  ## Enable the custom resources.
  enableCustomResources: true

  ## Enable custom NGINX configuration snippets in Ingress, VirtualServer, VirtualServerRoute and TransportServer resources.
  enableSnippets: true

  service:
    ## The type of service to create for the Ingress controller.
    type: NodePort

    ## The externalTrafficPolicy of the service. The value Local preserves the client source IP.
    # NOTE: Must be set to Cluster for requests coming via external load balancer to succeed 
    externalTrafficPolicy: Cluster

    httpPort:
      ## Enables the HTTP port for the Ingress controller service.
      enable: true

      ## The HTTP port of the Ingress controller service.
      port: 80

      ## The custom NodePort for the HTTP port. Requires controller.service.type set to NodePort.
      nodePort: "32080"

      ## The HTTP port on the POD where the Ingress controller service is running.
      targetPort: 80

    httpsPort:
      ## Enables the HTTPS port for the Ingress controller service.
      enable: true

      ## The HTTPS port of the Ingress controller service.
      port: 443

      ## The custom NodePort for the HTTPS port. Requires controller.service.type set to NodePort.
      nodePort: "32443"

      ## The HTTPS port on the POD where the Ingress controller service is running.
      targetPort: 443

  readyStatus:
    ## Enables readiness endpoint "/nginx-ready". The endpoint returns a success code when NGINX has loaded all the config after startup.
    enable: false

hostalp avatar Jun 06 '22 21:06 hostalp

This is the latest case, from today. The log covers the complete config reload from start to finish (there are no further events logged after that): nginx-ingress-nginx-ingress-hncm6_reconfig.log

I'm not 100% sure it's always the case, but it appears that successful config reloads don't log any of those alert messages (fork() failed while spawning "worker process" (11: Resource temporarily unavailable), sendmsg() failed (9: Bad file descriptor)) at all.

It appears it could be hitting some resource limit, though I don't have direct access to any of the nodes to obtain more information. As for the Kubernetes-level CPU & memory resource settings, these are the current ones (we increased the memory limit some time ago from the value shown in my previous message):

    resources:
      limits:
        cpu: "1"
        memory: 1536Mi
      requests:
        cpu: 100m
        memory: 256Mi

Each node has 16 CPUs and 40 GB RAM and isn't running out of either of those resources.
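Since I can't get onto the nodes directly, I've been checking usage from the Kubernetes side with roughly the following (this assumes metrics-server is installed in the cluster; <node-name> is a placeholder):

kubectl top pod -n ingress --containers
kubectl top node
kubectl describe node <node-name> | grep -A 8 'Allocated resources'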

EDIT: I've found cases of successful config reloads with the same alerts: nginx-ingress-nginx-ingress-hncm6_reconfig_OK.log, nginx-ingress-nginx-ingress-hncm6_reconfig_OK2.log (the same nginx instance earlier today; there were also a few others in between without those alerts logged), so they aren't always an indication of this issue.

hostalp avatar Jun 10 '22 09:06 hostalp

If you were hitting a resource limit, I would expect you to be reporting an OOMKill or to see it in the logs.

I just noticed the volume mounts. Could there be something happening with those? That could relate to the "temporarily unavailable" or "bad file descriptor" errors.

brianehlert avatar Jun 10 '22 17:06 brianehlert

That mount is just a single time zone data file used to set the time zone of the running container. I wouldn't expect any issue with it; moreover, it's mostly read during container initialization.

We found out that today there were some issues with the inability to create threads (runtime: failed to create new OS thread (have 9 already; errno=11), etc., where errno=11 means Resource temporarily unavailable) affecting some pods, which could be behind this. Our infrastructure provider doesn't know why it happened, but we'll keep an eye on it.
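In case anyone else hits the same errno=11 thread errors, these are the limits we're now keeping an eye on from inside the pod (a rough sketch; paths differ between cgroup v1 and v2, this assumes exec access, and the per-pod pids limit can also come from the kubelet's podPidsLimit setting):

# kernel-wide thread/pid limits as seen from the container
kubectl exec -n ingress <pod-name> -- cat /proc/sys/kernel/threads-max /proc/sys/kernel/pid_max
# per-pod pids limit on cgroup v2 (on cgroup v1 it's /sys/fs/cgroup/pids/pids.max)
kubectl exec -n ingress <pod-name> -- cat /sys/fs/cgroup/pids.max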

hostalp avatar Jun 10 '22 18:06 hostalp

Hi @hostalp

I would recommend double-checking the memory usage of the KIC pods. If the usage is close to the limit, you will definitely experience issues related to not having enough available memory. That's because during a configuration reload NGINX spawns new worker processes while the old ones are still running, so you can get a short-term memory spike until the old worker processes exit.

If that is the case, I suggest increasing the memory limit.

Also, note that by default NGINX will allocate as many worker processes as there are cores on the node. In your case of 16-CPU nodes, this means NGINX will create 16 workers. More workers = more memory. You can control the number of worker processes via the worker-processes ConfigMap key: https://docs.nginx.com/nginx-ingress-controller/configuration/global-configuration/configmap-resource/#general-customization
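With the Helm chart you're using, that key goes under controller.config.entries next to your existing entries, for example (a sketch; pick a worker count that fits your traffic):

controller:
  config:
    entries:
      # limit NGINX workers instead of defaulting to one per node core
      worker-processes: "2"

After a helm upgrade with the new value, the controller should pick up the updated ConfigMap and reload NGINX with the reduced worker count.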

pleshakov avatar Jun 10 '22 19:06 pleshakov

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Sep 09 '22 02:09 github-actions[bot]

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Dec 10 '22 01:12 github-actions[bot]

This issue was closed because it has been stalled for 10 days with no activity.

github-actions[bot] avatar Dec 21 '22 01:12 github-actions[bot]