
Retry loop on keep-alive connection to upstream servers

zedge-it opened this issue 2 years ago • 6 comments

What happened:

Ingress-controller is handling a bad request from a client, gets an empty reply from an upstream server, and ends up in a retry loop until the client disconnects.

This only happened when the ingress-controller had open keep-alive connections to the upstream servers.

What you expected to happen:

The ingress-controller should do 3 (proxy-next-upstream-tries) retries and then return a 502 to the client.
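For context, the expected behavior corresponds to nginx's `proxy_next_upstream*` directives, which the controller renders into its configuration. A rough sketch of the relevant directives (illustrative values, not the controller's exact rendered config):

```
location / {
    proxy_pass http://upstream_balancer;
    # Retry the next upstream peer on connection errors,
    # but give up after 3 attempts in total:
    proxy_next_upstream error timeout;
    proxy_next_upstream_tries 3;
}
```

With `proxy_next_upstream_tries 3`, nginx should stop after three failed attempts and return a 502 rather than retrying indefinitely.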

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.5.1
  Build:         d003aae913cc25f375deb74f898c7f3c65c06f05
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Kubernetes version (use kubectl version):

Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.12-gke.1600", GitCommit:"5b5d8c9b3bf9c7e4b50c276f4d165d176e310dfe", GitTreeState:"clean", BuildDate:"2022-10-13T09:30:22Z", GoVersion:"go1.17.13b7", Compiler:"gc", Platform:"linux/amd64"}

How to reproduce this issue:

  1. Use minikube
minikube start
  2. Install ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade -i ingress-nginx ingress-nginx/ingress-nginx --version 4.4.0 -f values-ingress-nginx.yaml
  • values-ingress-nginx.yaml:
controller:
  config:
    enable-vts-status: "true"
    vts-status-zone-size: "32m"
    hsts: "false"
    hsts-include-subdomains: "false"
    use-geoip: "false"
    use-geoip2: "false"
    use-forwarded-headers: "true"
    map-hash-bucket-size: "128"
    use-gzip: "true"
  electionID: nginx-ingress-controller-leader
  ingressClass: nginx
  publishService:
    enabled: true
  podAnnotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  service:
    enabled: true
    type: NodePort
    externalTrafficPolicy: Local
  revisionHistoryLimit: 5
  3. Install upstream service
helm repo add bitnami https://charts.bitnami.com/bitnami
helm upgrade -i nginx-empty-replyserver bitnami/nginx --version 13.2.13 -f values-nginx-empty-replyserver.yaml
  • values-nginx-empty-replyserver.yaml:
replicaCount: 3
service:
  type: ClusterIP
ingress:
  enabled: true
  hostname: foo.bar
  annotations:
    kubernetes.io/ingress.class: nginx
serverBlock: |-
  server {
    listen  8080;

    location /die {
      return 444;
    }

    location /stub_status {
      stub_status on;
    }
  }

This service has an endpoint (/) that returns 200 OK, and an endpoint (/die) that returns an empty reply.

  4. Create minikube service tunnel
minikube service ingress-nginx-controller
|-----------|--------------------------|-------------|------------------------|
| NAMESPACE |           NAME           | TARGET PORT |          URL           |
|-----------|--------------------------|-------------|------------------------|
| default   | ingress-nginx-controller |             | http://127.0.0.1:53482 |
|           |                          |             | http://127.0.0.1:53483 |
|-----------|--------------------------|-------------|------------------------|
  5. Warm up the keep-alive connections with valid 200 OK requests
wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://127.0.0.1:53482/
  6. Before the idle keep-alive connections are closed (60s), send a request that returns an empty reply
curl -H 'Host: foo.bar' http://127.0.0.1:53482/die

This will hang until you stop it.

Anything else we need to know:

If you disable keep-alive connections in the ingress-controller config, it retries 3 times and returns "502 Bad Gateway" as expected:

upstream-keepalive-requests: 0
upstream-keepalive-timeout: 0
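For reference, those keys go in the controller's ConfigMap. A sketch of where they live (the ConfigMap name and namespace below are illustrative; they depend on the Helm release and installation):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # illustrative; depends on the release name
  namespace: ingress-nginx         # illustrative; depends on the install namespace
data:
  # Setting both to "0" disables keep-alive to the upstreams:
  upstream-keepalive-requests: "0"
  upstream-keepalive-timeout: "0"
```

With the Helm chart used above, the equivalent is setting these keys under `controller.config` in the values file.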

zedge-it avatar Nov 23 '22 22:11 zedge-it

@zedge-it: This issue is currently awaiting triage.

If ingress-nginx contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Nov 23 '22 22:11 k8s-ci-robot

/assign @strongjz

Can you post the ingress object as well?

If ingress-nginx gets a request for a service it doesn't have, it sends the request to the default backend.

strongjz avatar Dec 08 '22 16:12 strongjz

/triage needs-information

strongjz avatar Dec 08 '22 16:12 strongjz

$ kubectl get ingress nginx-empty-replyserver -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: nginx-empty-replyserver
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2022-11-22T11:49:13Z"
  generation: 2
  labels:
    app.kubernetes.io/instance: nginx-empty-replyserver
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: nginx
    helm.sh/chart: nginx-13.2.13
  name: nginx-empty-replyserver
  namespace: default
  resourceVersion: "11762"
  uid: e0ca2334-82f7-4d99-968a-6098006cc094
spec:
  rules:
  - host: foo.bar
    http:
      paths:
      - backend:
          service:
            name: nginx-empty-replyserver
            port:
              name: http
        path: /
        pathType: ImplementationSpecific
status:
  loadBalancer:
    ingress:
    - ip: 10.98.231.89

zedge-it avatar Dec 09 '22 07:12 zedge-it

You can disable the retries to the upstream (the default is 3 tries) with the annotation/configuration below:

nginx.ingress.kubernetes.io/proxy-next-upstream-tries: '1'
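Applied to the reporter's Ingress object (shown earlier in the thread), that would look roughly like this (a sketch, not tested here):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-empty-replyserver
  annotations:
    kubernetes.io/ingress.class: nginx
    # Give up after the first attempt instead of retrying:
    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "1"
spec:
  rules:
  - host: foo.bar
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: nginx-empty-replyserver
            port:
              name: http
```

Note this caps retries per request rather than fixing the underlying keep-alive retry loop.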

roorkrn avatar Dec 07 '23 19:12 roorkrn

I ran into the same problem and would like to know the root cause of this. @strongjz

thanks~

zengyuxing007 avatar May 07 '24 05:05 zengyuxing007

Hi,

The description of response code 444 is as follows:

A non-standard status code used to instruct nginx to close the connection without sending a response to the client, most commonly used to deny malicious or malformed requests.

I first tested this without any customization anywhere, using a vanilla nginx:alpine image as the backend upstream. As expected, there was no retry loop.

Then I tested with the suggested backend that sends a 444 and the connection never closed, as described.

% date && time wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://`minikube ip`
Mon Sep  9 01:25:22 PM IST 2024
Running 5s test @ http://192.168.49.2
  2 threads and 5 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.57ms   22.09ms  94.25ms   83.60%
    Req/Sec   730.52    166.98     1.54k    81.00%
  7286 requests in 5.01s, 5.98MB read
Requests/sec:   1453.19
Transfer/sec:      1.19MB
wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://`minikube ip`  0.05s user 0.29s system 6% cpu 5.020 total
[~] 
% date && time curl -H 'Host: foo.bar' `minikube ip`/die                 
Mon Sep  9 01:25:30 PM IST 2024
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
curl -H 'Host: foo.bar' `minikube ip`/die  0.00s user 0.01s system 70% cpu 0.012 total
[~] 
% k describe ing nginx-empty-replyserver 
Name:             nginx-empty-replyserver
Labels:           app.kubernetes.io/instance=nginx-empty-replyserver
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=nginx
                  app.kubernetes.io/version=1.27.1
                  helm.sh/chart=nginx-18.1.11
Namespace:        default
Address:          192.168.49.2
Ingress Class:    nginx
Default backend:  <default>
Rules:
  Host        Path  Backends
  ----        ----  --------
  foo.bar     
              /   nginx-empty-replyserver:http (10.244.0.59:8080)
Annotations:  meta.helm.sh/release-name: nginx-empty-replyserver
              meta.helm.sh/release-namespace: default
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    14m (x3 over 16m)  nginx-ingress-controller  Scheduled for sync

I did not experience a hang.

On one hand, I think the issue description needs editing if we are going to work on this: it should be reproduced with the latest controller on a minikube cluster, and all details are needed.

On the other hand, there is the configuration option mentioned by @zedge-it: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#upstream-keepalive-timeout .

So as of today, after 2 years, the status is this: not using the ConfigMap to set the timeout to 0, and yet expecting no retry loop, is a matter of discussion. Specifically, with a wrk-like tool creating a lot of connections, the conntrack table and TCP window buffers may get exhausted. This follows from the fact that, in the absence of a reply from the upstream, the controller will hold a large number of concurrently ESTABLISHED connections, each waiting for the default time period of roughly 300 seconds before moving to the CLOSE_WAIT state.

Even otherwise, it is not possible to implement such expectations: upstream nginx provides the timeout option, and it has been exposed as a ConfigMap entry. The long wait of 300 seconds (or whatever the default time period is) for an ESTABLISHED connection to the upstream, under a wrk-like flood against a server pre-configured to send back a 444, is not really a core area for an ingress controller. The project has to focus on core Ingress-API functionality while ensuring the controller is secure by default, out of the box, because there are not many resources to work on non-Ingress-API features. We are actually deprecating many less-used non-Ingress-API features.

/kind feature

Hence I am closing this issue with the clear message that the timeout ConfigMap setting is the best feasible solution for the wrk-flood use case with a server responding 444. Thanks.

/close

longwuyuan avatar Sep 09 '24 08:09 longwuyuan

@longwuyuan: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Sep 09 '24 08:09 k8s-ci-robot