ingress-nginx
Retry loop on keep-alive connection to upstream servers
What happened:
Ingress-controller is handling a bad request from a client, gets an empty reply from an upstream server, and ends up in a retry loop until the client disconnects.
This only happened when the ingress-controller had open keep-alive connections to the upstream servers.
What you expected to happen:
The ingress-controller should do 3 (proxy-next-upstream-tries) retries and then return a 502 to the client.
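For reference, this expectation corresponds roughly to the following nginx retry directives and their ConfigMap defaults; this is a hedged approximation, not copied from an actual controller-generated nginx.conf:
proxy_next_upstream         error timeout;   # which failures trigger a retry (proxy-next-upstream)
proxy_next_upstream_timeout 0;               # no overall time limit on retries (proxy-next-upstream-timeout)
proxy_next_upstream_tries   3;               # expected cap on attempts (proxy-next-upstream-tries)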
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version):
NGINX Ingress controller
Release: v1.5.1
Build: d003aae913cc25f375deb74f898c7f3c65c06f05
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6
Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.12-gke.1600", GitCommit:"5b5d8c9b3bf9c7e4b50c276f4d165d176e310dfe", GitTreeState:"clean", BuildDate:"2022-10-13T09:30:22Z", GoVersion:"go1.17.13b7", Compiler:"gc", Platform:"linux/amd64"}
How to reproduce this issue:
- Use minikube
minikube start
- Install ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade -i ingress-nginx ingress-nginx/ingress-nginx --version 4.4.0 -f values-ingress-nginx.yaml
- values-ingress-nginx.yaml:
controller:
  config:
    enable-vts-status: "true"
    vts-status-zone-size: "32m"
    hsts: "false"
    hsts-include-subdomains: "false"
    use-geoip: "false"
    use-geoip2: "false"
    use-forwarded-headers: "true"
    map-hash-bucket-size: "128"
    use-gzip: "true"
  electionID: nginx-ingress-controller-leader
  ingressClass: nginx
  publishService:
    enabled: true
  podAnnotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  service:
    enabled: true
    type: NodePort
    externalTrafficPolicy: Local
  revisionHistoryLimit: 5
- Install the upstream service
helm repo add bitnami https://charts.bitnami.com/bitnami
helm upgrade -i nginx-empty-replyserver bitnami/nginx --version 13.2.13 -f values-nginx-empty-replyserver.yaml
- values-nginx-empty-replyserver.yaml:
replicaCount: 3
service:
  type: ClusterIP
ingress:
  enabled: true
  hostname: foo.bar
  annotations:
    kubernetes.io/ingress.class: nginx
serverBlock: |-
  server {
    listen 8080;
    location /die {
      return 444;
    }
    location /stub_status {
      stub_status on;
    }
  }
This service has an endpoint (/) that returns 200 OK, and an endpoint (/die) that returns an empty reply.
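To verify the backend behaviour directly, bypassing the ingress-controller, you can port-forward to the service (a sketch; it assumes the bitnami chart's default mapping of service port 80 to container port 8080):
kubectl port-forward svc/nginx-empty-replyserver 8081:80 &
curl -sv http://127.0.0.1:8081/        # expect 200 OK
curl -sv http://127.0.0.1:8081/die     # expect curl: (52) Empty reply from server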
- Create minikube service tunnel
minikube service ingress-nginx-controller
|-----------|--------------------------|-------------|------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|-----------|--------------------------|-------------|------------------------|
| default | ingress-nginx-controller | | http://127.0.0.1:53482 |
| | | | http://127.0.0.1:53483 |
|-----------|--------------------------|-------------|------------------------|
- Warm up the keep-alive connections with valid 200 OK requests
wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://127.0.0.1:53482/
- Before the idle keep-alive connections are closed (60s), send a request that returns an empty reply
curl -H 'Host: foo.bar' http://127.0.0.1:53482/die
This will hang until you stop it.
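To observe the behaviour without leaving the request hanging indefinitely, you can bound it on the client side, for example:
curl --max-time 10 -H 'Host: foo.bar' http://127.0.0.1:53482/die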
Anything else we need to know:
If you disable keep-alive connections in the ingress-controller config, it will retry 3 times and return "502 Bad Gateway" as expected:
upstream-keepalive-requests: 0
upstream-keepalive-timeout: 0
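For example, this workaround can be applied through the same Helm values file used above (a sketch; the keys are the documented ConfigMap options):
controller:
  config:
    upstream-keepalive-requests: "0"
    upstream-keepalive-timeout: "0"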
@zedge-it: This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/assign @strongjz
Can you post the ingress object as well?
If ingress-nginx gets a request for a service it doesn't have, it will send it to the default backend.
/triage needs-information
$ kubectl get ingress nginx-empty-replyserver -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: nginx-empty-replyserver
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2022-11-22T11:49:13Z"
  generation: 2
  labels:
    app.kubernetes.io/instance: nginx-empty-replyserver
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: nginx
    helm.sh/chart: nginx-13.2.13
  name: nginx-empty-replyserver
  namespace: default
  resourceVersion: "11762"
  uid: e0ca2334-82f7-4d99-968a-6098006cc094
spec:
  rules:
  - host: foo.bar
    http:
      paths:
      - backend:
          service:
            name: nginx-empty-replyserver
            port:
              name: http
        path: /
        pathType: ImplementationSpecific
status:
  loadBalancer:
    ingress:
    - ip: 10.98.231.89
You can disable the multiple retries (default is 3) to the upstream with the annotation/configuration below:
nginx.ingress.kubernetes.io/proxy-next-upstream-tries: '1'
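On the Ingress used in this reproduction, that would look roughly like this (sketch only):
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "1"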
I met the same problem and would like to know what the root cause of this is. @strongjz
Thanks~
Hi,
The description for the response code 444 is as follows:
A non-standard status code used to instruct nginx to close the connection without sending a response to the client, most commonly used to deny malicious or malformed requests.
I tested this without any customization whatsoever, using a vanilla nginx:alpine image as the backend upstream. As expected, there was no retry loop.
Then I tested with the suggested backend that sends a 444 and the connection never closed, as described.
% date && time wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://`minikube ip`
Mon Sep 9 01:25:22 PM IST 2024
Running 5s test @ http://192.168.49.2
2 threads and 5 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 14.57ms 22.09ms 94.25ms 83.60%
Req/Sec 730.52 166.98 1.54k 81.00%
7286 requests in 5.01s, 5.98MB read
Requests/sec: 1453.19
Transfer/sec: 1.19MB
wrk -c 5 -d 5 -t 2 -H 'Host: foo.bar' http://`minikube ip` 0.05s user 0.29s system 6% cpu 5.020 total
[~]
% date && time curl -H 'Host: foo.bar' `minikube ip`/die
Mon Sep 9 01:25:30 PM IST 2024
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
curl -H 'Host: foo.bar' `minikube ip`/die 0.00s user 0.01s system 70% cpu 0.012 total
[~]
% k describe ing nginx-empty-replyserver
Name: nginx-empty-replyserver
Labels: app.kubernetes.io/instance=nginx-empty-replyserver
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=nginx
app.kubernetes.io/version=1.27.1
helm.sh/chart=nginx-18.1.11
Namespace: default
Address: 192.168.49.2
Ingress Class: nginx
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
foo.bar
/ nginx-empty-replyserver:http (10.244.0.59:8080)
Annotations: meta.helm.sh/release-name: nginx-empty-replyserver
meta.helm.sh/release-namespace: default
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 14m (x3 over 16m) nginx-ingress-controller Scheduled for sync
I did not experience a hang.
On one hand, I think the issue description needs editing if we are working on this: a test on a minikube cluster with the latest controller, and all details provided.
But on the other hand, there is the configuration available as mentioned by @zedge-it: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#upstream-keepalive-timeout .
So as of today, after 2 years, the status is that NOT using the ConfigMap to set the timeout to 0, and yet expecting that there will be no retry loop, is a matter of discussion. Specifically, with a wrk-like tool generating a lot of connections, the conntrack table and TCP window buffers may get exhausted. This follows simply from the fact that, in the absence of a reply from the upstream, the controller will hold a large number of ESTABLISHED connections concurrently, each waiting for the default period of roughly 300 seconds before changing to the CLOSE_WAIT state.
Even otherwise, it is not possible to implement such expectations: the upstream nginx software provides the timeout option, and it has been exposed as a ConfigMap entry. The long wait of 300 seconds (or whatever the default period is) for an ESTABLISHED connection to the upstream, under a wrk-like flood against a server pre-configured to send back a response code of 444, is not really a core area for an ingress-controller. The controller project has to focus on the core Ingress-API functionality while ensuring the controller is secure by default, out of the box, because there are not many resources to work on non-Ingress-API features. We are actually deprecating many less-used non-Ingress-API features.
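As a sketch of applying that setting directly (this assumes the ConfigMap name ingress-nginx-controller created by the Helm release used above; adjust the name and namespace to your installation):
kubectl -n default patch configmap ingress-nginx-controller \
  --type merge -p '{"data":{"upstream-keepalive-timeout":"0"}}'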
/kind feature
Hence I am closing this issue with the clear message that using the timeout ConfigMap setting is the best feasible solution for the wrk-flood use case with a server responding 444. Thanks.
/close
@longwuyuan: Closing this issue.