ingress-nginx
since controller 1.10.0 (chart 4.10.0): ingress rejects duplicate "Transfer-Encoding: chunked" and returns 502
What happened: after upgrading the controller from 1.9.6 (chart 4.9.1) to 1.10.0 (chart 4.10.0), we observe the ingress answering 502 on behalf of the service. When rolling back to 1.9.6, behaviour returns to healthy.
What you expected to happen: the ingress behaving as in 1.9.6 and not answering 502 where it did not before
There appears to be a regression of some sort. We updated from 1.9.6 to 1.10.0 and observed some requests returning 502 right after the upgrade. We then downgraded and saw the 502s drop back to previous numbers.
One instance where we observe it is when Transfer-Encoding: chunked is set twice, once in application code and once via Spring Boot.
We also observe the following error message in the logs:
ingress-nginx-controller-8bf5b5f98-w8gbz 2024/03/22 11:01:54 [error] 575#575: *28680 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 127.0.0.1, server: localhost, request: "POST /retrieve HTTP/1.1", upstream: "http://10.26.195.10:8080/retrieve", host: "localhost:8076"
ingress-nginx-controller-8bf5b5f98-w8gbz 127.0.0.1 - - [22/Mar/2024:11:01:54 +0000] "POST /retrieve HTTP/1.1" 502 150 "-" "PostmanRuntime/7.36.3" 3208 2.317 [default-ebilling-retrieval-http] ] 10.26.195.10:8080 0 2.318 502 94bc05d81342c91791fac0f02cb64434
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): 1.25.3
/etc/nginx $ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.10.0
Build: 71f78d49f0a496c31d4c19f095469f3f23900f8a
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.3
-------------------------------------------------------------------------------
/etc/nginx $
Kubernetes version (use kubectl version): v1.28.5
Environment:
- Cloud provider or hardware configuration: Azure AKS
- OS (e.g. from /etc/os-release): Ubuntu on nodes
/etc/nginx $ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19.1
PRETTY_NAME="Alpine Linux v3.19"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
/etc/nginx $
- Kernel (e.g. uname -a):
/etc/nginx $ uname -a
Linux ingress-nginx-controller-8bf5b5f98-w8gbz 5.15.0-1057-azure #65-Ubuntu SMP Fri Feb 9 18:39:24 UTC 2024 x86_64 Linux
/etc/nginx $
- Install tools: AKS via terraform, ingress via helm (but also via terraform)
Please mention how/where was the cluster created like kubeadm/kops/minikube/kind etc.
- Basic cluster related info:
kubectl version
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"windows/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.5", GitCommit:"506050d61cf291218dfbd41ac93913945c9aa0da", GitTreeState:"clean", BuildDate:"2023-12-23T00:10:25Z", GoVersion:"go1.20.12", Compiler:"gc", Platform:"linux/amd64"}
kubectl get nodes -o wide
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-default-15207550-vmss000000 Ready agent 7d2h v1.28.5 10.26.194.4 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool001-49772321-vmss000000 Ready agent 7d1h v1.28.5 10.26.195.9 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool001-49772321-vmss000001 Ready agent 7d1h v1.28.5 10.26.194.207 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool001-49772321-vmss00000b Ready agent 7d1h v1.28.5 10.26.194.33 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool002-37360131-vmss00000h Ready agent 7d v1.28.5 10.26.194.91 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool002-37360131-vmss00000q Ready agent 7d v1.28.5 10.26.194.120 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
aks-pool002-37360131-vmss00000v Ready agent 7d v1.28.5 10.26.194.149 <none> Ubuntu 22.04.4 LTS 5.15.0-1057-azure containerd://1.7.7-1
- How was the ingress-nginx-controller installed:
- If helm was used then please show output of
helm ls -A | grep -i ingress
$ helm ls -A | grep -i ingress
ingress-nginx ap-system 1 2024-03-21 08:44:29.913568481 +0000 UTC deployed ingress-nginx-4.10.0
- If helm was used then please show output of
helm -n <ingresscontrollernamespace> get values <helmreleasename>
$ helm -n ap-system get values ingress-nginx
USER-SUPPLIED VALUES:
controller:
ingressClassResource:
name: nginx
service:
type: ClusterIP
- If helm was not used, then copy/paste the complete precise command used to install the controller, along with the flags and options used
- If you have more than one instance of the ingress-nginx-controller installed in the same cluster, please provide details for all the instances
- Current State of the controller:
kubectl describe ingressclasses
$ kubectl describe ingressclasses
Name: nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.10.0
helm.sh/chart=ingress-nginx-4.10.0
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ap-system
Controller: k8s.io/ingress-nginx
Events: <none>
- kubectl -n <ingresscontrollernamespace> get all -A -o wide
- kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
- kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
- Current state of ingress object, if applicable:
- kubectl -n <appnamespace> get all,ing -o wide
- kubectl -n <appnamespace> describe ing <ingressname>
- If applicable, then, your complete and exact curl/grpcurl command (redacted if required) and the response to the curl/grpcurl command with the -v flag
- Others:
- Any other related information like:
- copy/paste of the snippet (if applicable)
- kubectl describe ... of any custom configmap(s) created and in use
- Any other related information that may help
How to reproduce this issue:
We have a service that sometimes answers with 2x Transfer-Encoding: chunked and sometimes with 1x Transfer-Encoding: chunked. When it answers with the header twice, the response does not reach the client and the ingress answers with 502.
We hook into the ingress via port-forward to reproduce it, but have no better setup for reproducing locally.
kubectl port-forward -n ap-system ingress-nginx-controller-8bf5b5f98-w8gbz 8076 8080
- <send requests via Postman>
- observe a 502 and the log entry when the pod answers with the duplicate Transfer-Encoding: chunked header, and a regular 200 when it answers with a single header
We have also observed this for other clients with other backends, but so far have only reproduced this one particular endpoint with "always errors when X and always ok when Y".
Anything else we need to know:
We have now deployed a separate ingress at 1.10.0 (with a separate ingress class) and can observe this behaviour while our "hot" ingress that gets traffic is back on 1.9.6: the 1.10.0 instance still breaks while we have an operational ingress running 1.9.6. It sounds somewhat similar to https://github.com/spring-projects/spring-boot/issues/37646
/remove-kind bug
- I just tested controller v1.10 and I have used v1.10 since its release and I don't see this problem.
- You are using service type ClusterIP, which is an uncommon use-case. It implies that there is no service of type LoadBalancer where the connection from outside the cluster terminates
- You are referring to headers and HTTP specs like Transfer-Encoding. While other users with the same use-case and experts may comment on this, it does not give actionable, detailed information with regard to a bug
- You have not provided critical details that are asked for in the issue template, so we don't know if you are using snippets
It's better if you write a step-by-step procedure that someone can copy/paste from. Please use the image httpbun.org as it will get people on the same page. Please use minikube to reproduce the problem. Ensure you provide all manifests that someone can copy/paste and reproduce.
/kind support /triage needs-information
@longwuyuan I have the same problem as @DRoppelt with the 1.10.0 release on a bare-metal cluster (v1.28.6) with LoadBalancer service type (MetalLB v0.14.3). Version 1.9.6 works without any problems.
@AlexPokatilov thanks for updating.
- Is it possible to reproduce this on minikube, using a backend created from httpbun.org?
- With a simple ingress for that httpbun.org backend, all my requests succeed, so I suspect I will need your ingress reported here as output of kubectl describe
- Also, if all the kubectl describe outputs asked for in the new-issue template were provided, it would have surfaced some of the critical info required to analyze this
Do you happen to have a sample ticket where someone reproduced a bug that was based on the server's response? Part of the reproduction setup would be to have a backend behind the ingress respond in a certain way to showcase the setup in minikube.
I could create a Spring Boot app that does that, build an image locally and deploy it, but that sounds like a lot of overhead for someone trying to reproduce who does not happen to have mvn & jdk locally.
Also, you will see there are no logs in the issue description. Essentially this info is already asked for in the issue template, but I will list it again (not YAML, but the current state of resources shown as kubectl describe output, and also the logs of the controller pod):
- kubectl describe of controller related resources
- kubectl describe of app resources like service and ingress
- Actual curl as executed with -v in complete exact detail
- Actual logs of the controller pod(s)
- kubectl get events -A (if related)
I will get back to you tomorrow and supply missing info. Sorry for that.
I have used the template but omitted some of the parts that seemed less relevant but obviously were more relevant than I initially thought.
@DRoppelt thanks for the idea.
By discussing here or by copy/pasting data, if we can triage the issue description to an aspect that is related to the reverse proxy or the Ingress API, it would help make progress.
If we make it a Maven/JDK/JRE/JSP-specific discussion, then later in the discussion we will still arrive at a step where we pinpoint which reverse-proxy concept, or which nginx-as-a-reverse-proxy concept, we are looking at, in addition to which K8S-Ingress-API related problem we are looking at.
In this case, it seems you are pointing at your Java server setting "Transfer-Encoding: chunked". If so, we need to clearly establish whether the ingress-controller is doing something to that header when rejecting the response. If so, then the prime suspect is nginx v1.25, which is the upgrade from the nginx v1.21 used in controller v1.9.x.
I am wondering why the nginx reverse proxy should care what value you set for that header. We don't have any business messing with that. I don't know if you meant you were setting that header both in your code and in the Spring Boot framework. Either way, I know for sure that the controller does not get in the way normally. Hence we need to deep dive.
BUT a clear issue description comes first. And the specific test where we alter headers comes next.
Ok, so since the upgrade to controller 1.10.0 (chart 4.10.0), we observe 502s (not always, but at least one case where we can reproduce it deterministically)
here is an attempt to visualize:
client--(1)->nginx--(2)->backend
client<-(4)--nginx<-(3)--backend
given that
- (1) is a POST
- (3) has the response header Transfer-Encoding: chunked set twice
then nginx appears to intercept the original response (3) and answers on behalf of the backend with (4), a 502 response code.
We also observe the following log entries with matching timestamps:
ingress-nginx-controller-8bf5b5f98-w8gbz 2024/03/22 11:01:54 [error] 575#575: *28680 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 127.0.0.1, server: localhost, request: "POST /retrieve HTTP/1.1", upstream: "http://10.26.195.10:8080/retrieve", host: "localhost:8076"
ingress-nginx-controller-8bf5b5f98-w8gbz 127.0.0.1 - - [22/Mar/2024:11:01:54 +0000] "POST /retrieve HTTP/1.1" 502 150 "-" "PostmanRuntime/7.36.3" 3208 2.317 [default-ebilling-retrieval-http] ] 10.26.195.10:8080 0 2.318 502 94bc05d81342c91791fac0f02cb64434
We found that when altering the backend to set the header once (as the backend probably should anyway), the response goes through to the client as (4) with a 200. The backend setting the header twice is odd behaviour, but it previously went through where it now fails. We have reproduced this in an AKS cluster but not yet fully locally.
Reproducing
I tried to set up minikube and get a sample going with a Spring Boot app, but it appears that minikube comes with controller 1.9.4 and I found no way to pin it to 1.10.0, please advise.
Do you have a sample ticket with httpbun.org I could copy from? I have a hard time understanding how to set up a sample based on that.
- The httpbun image contains a compiled (Go) binary, so it is not easy to change it to set Transfer-Encoding: chunked
- Run your Java app in minikube
- BUT create the minikube instance with a VM instead of the docker driver
- Install MetalLB from metallb.org
- Set the minikube ip output as the start and end of the L2 IP address pool. This will assign that same IP to ingress-nginx (see the example manifest after this list)
- Install ingress-nginx using helm on minikube
- Make an /etc/hosts entry for the minikube IP to point to your app's ingress hostname
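A minimal sketch of such a MetalLB layer-2 configuration, assuming the MetalLB v0.14.x CRDs are installed in the metallb-system namespace; the pool and advertisement names and the 192.168.49.2 address are placeholders, substitute the output of minikube ip:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: minikube-pool          # placeholder name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.49.2-192.168.49.2  # start and end both set to the minikube ip
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: minikube-l2            # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - minikube-pool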
Personally, I am going to just use an nginx:alpine image to create a backend and see if I can just add the header to the nginx.conf in that pod.
I see this:
% curl -L httpbun.dev.enjoydevops.com/headers -D t.txt
{
"Accept": "*/*",
"Host": "httpbun.dev.enjoydevops.com",
"User-Agent": "curl/7.81.0",
"X-Forwarded-For": "192.168.122.1",
"X-Forwarded-Host": "httpbun.dev.enjoydevops.com",
"X-Forwarded-Port": "443",
"X-Forwarded-Proto": "https",
"X-Forwarded-Scheme": "https",
"X-Real-Ip": "192.168.122.1",
"X-Request-Id": "4cff8aa1823b57b21487300b7a6e2505",
"X-Scheme": "https"
}
[~]
% less t.txt
[~]
% cat t.txt
HTTP/1.1 308 Permanent Redirect
Date: Mon, 25 Mar 2024 19:21:33 GMT
Content-Type: text/html
Content-Length: 164
Connection: keep-alive
Location: https://httpbun.dev.enjoydevops.com/headers
HTTP/2 200
date: Mon, 25 Mar 2024 19:21:33 GMT
content-type: application/json
content-length: 388
x-powered-by: httpbun/5025308c3a9df224c10faae403ae888ad5c3ecc5
strict-transport-security: max-age=31536000; includeSubDomains
So it will be interesting to see your curl and the same info.
Trying to correlate whether nginx v1.25 does anything different.
Also, I read here that HTTP/2 does not allow that header, and I see HTTP/2 in play in my test above:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding
It is a bit too late (as in time of day in Europe) for me to get a feel for minikube; nevertheless, thank you for the steps, I will look at them myself tomorrow.
Meanwhile, here is a backend that can be used for testing. chunky-demo.zip
@RestController
@Slf4j
public class ChunkyRestController {
//...
@PostMapping("/chunky/{times}/{addedheader_times}")
public void streamedFile(@PathVariable Integer times,@PathVariable Integer addedheader_times, HttpServletResponse response) throws IOException, InterruptedException {
for (int i = 0; i < addedheader_times; i++) {
log.info("added chunking header for the {}th time", i);
response.addHeader("Transfer-Encoding", "chunked");
}
PrintWriter writer = response.getWriter();
for (int i = 0; i < times; i++) {
writer.println(i);
writer.flush();
log.info("flushed, waiting before flushing again {}/{}", i, times+1);
int waitIntervalMs = 100;
Thread.sleep(waitIntervalMs);
}
}
}
$ podman build --tag chunky-demo -f Dockerfile && podman run -p 8080:8080 chunky-demo
...
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v3.2.4)
2024-03-25T20:13:27.691Z INFO 1 --- [demo] [ main] com.example.demo.Application : Starting Application v0.0.1-SNAPSHOT using Java 21.0.2 with PID 1 (/app/app.jar started by root in /app)
2024-03-25T20:13:27.698Z INFO 1 --- [demo] [ main] com.example.demo.Application : No active profile set, falling back to 1 default profile: "default"
2024-03-25T20:13:29.230Z INFO 1 --- [demo] [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port 8080 (http)
2024-03-25T20:13:29.246Z INFO 1 --- [demo] [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2024-03-25T20:13:29.246Z INFO 1 --- [demo] [ main] o.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/10.1.19]
2024-03-25T20:13:29.296Z INFO 1 --- [demo] [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2024-03-25T20:13:29.297Z INFO 1 --- [demo] [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1469 ms
2024-03-25T20:13:29.715Z INFO 1 --- [demo] [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path ''
2024-03-25T20:13:29.742Z INFO 1 --- [demo] [ main] com.example.demo.Application : Started Application in 2.634 seconds (process running for 3.243)
curl -X POST localhost:8080/chunky/$chunktimes/$addChunkheaderTimes -v
The API has 2 parameters: it will stream a response in 100 ms intervals for $chunktimes iterations and set $addChunkheaderTimes additional Transfer-Encoding: chunked headers (one more comes from the embedded Tomcat itself).
I hope this helps!
➜ ~ curl -X POST localhost:8080/chunky/10/1 -v
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /chunky/10/1 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.0.1
> Accept: */*
>
< HTTP/1.1 200
< Transfer-Encoding: chunked
< Transfer-Encoding: chunked
< Date: Mon, 25 Mar 2024 20:07:27 GMT
<
0
1
2
3
4
5
6
7
8
9
* Connection #0 to host localhost left intact
@DRoppelt thank you. It really helped.
- I can reproduce the 502 with the jar and the request you provided
- This happens in controller v1.10.x because, inside controller v1.10.x, the nginx component of openresty was upgraded to nginx v1.25.x
- Other people on the internet, facing the same problem after upgrading to nginx v1.24+, resolved this by setting "chunked_transfer_encoding off;" in their nginx.conf
- These links have the details and the solution https://duckduckgo.com/?t=ffab&q=upstream+sent+duplicate+header+line%3A+%22Transfer-Encoding%3A+chunked%22%2C+previous+value%3A+%22Transfer-Encoding%3A+chunked%22+while+reading+response+header+from+upstream&atb=v390-1&ia=web
- Currently I think a server-snippet or a configuration-snippet can be used by you to set "chunked_transfer_encoding off;" (a sketch of that idea follows after this list)
- Please wait for the maintainers to comment on this. Maybe we need to inspect whether we should ship the controller with this setting out of the box. Maybe what I am talking about is wrong.
- I don't know if the controller set with "chunked_transfer_encoding off;" and your app setting "Transfer-Encoding: chunked" is a good combination. I will test this and update
/kind bug /remove-kind support /triage accepted
@tao12345666333 @rikatz @cpanato @strongjz because of the bump to nginx v1.25.x, chunking related behaviour of the controller is broken out of the box. Details above.
Basically internet search says a directive like "chunked_transfer_encoding off;" has to be set, on the controller, if a backend upstream app is setting "Transfer-Encoding: chunked".
Right now, the controller is rejecting the response from a backend upstream app with the log messages below:
2024/03/26 03:39:34 [error] 267#267: *358731 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 192.168.122.1, server: chunky-demo.dev.enjoydevops.com, request: "POST /chunky/10/1 HTTP/1.1", upstream: "http://10.244.0.68:8080/chunky/10/1", host: "chunky-demo.dev.enjoydevops.com"
192.168.122.1 - - [26/Mar/2024:03:39:34 +0000] "POST /chunky/10/1 HTTP/1.1" 502 150 "-" "curl/7.81.0" 107 0.006 [default-chunky-demo-8080] [] 10.244.0.68:8080 0 0.006 502 9d060a48534b2dcb3b18c598a5c1ee06
2024/03/26 03:40:04 [error] 267#267: *358980 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 192.168.122.1, server: chunky-demo.dev.enjoydevops.com, request: "POST /chunky/10/1 HTTP/1.1", upstream: "http://10.244.0.68:8080/chunky/10/1", host: "chunky-demo.dev.enjoydevops.com"
192.168.122.1 - - [26/Mar/2024:03:40:04 +0000] "POST /chunky/10/1 HTTP/1.1" 502 150 "-" "curl/7.81.0" 107 0.007 [default-chunky-demo-8080] [] 10.244.0.68:8080 0 0.007 502 91f6129d03e71241cf37446190a46d2b
- The reproduce steps are provided in one of the messages above. I built that image locally, did docker save of the image to a tar, then minikube image load into my minikube; an illustrative manifest sketch follows below
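For anyone reproducing this, deploying that locally loaded image could look roughly like the sketch below; the chunky-demo names, host, and image tag are illustrative placeholders, not the exact manifests used in this thread:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chunky-demo                       # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chunky-demo
  template:
    metadata:
      labels:
        app: chunky-demo
    spec:
      containers:
        - name: chunky-demo
          image: chunky-demo:latest       # the image loaded via minikube image load
          imagePullPolicy: Never          # use only the locally loaded image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: chunky-demo
spec:
  selector:
    app: chunky-demo
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chunky-demo
spec:
  ingressClassName: nginx
  rules:
    - host: chunky-demo.example.com       # placeholder host; add it to /etc/hosts
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chunky-demo
                port:
                  number: 8080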
Seems related https://github.com/kubernetes/ingress-nginx/issues/4838
@DRoppelt I think I have understood the problem and the solution
- I removed the header setting code from your example java app
% cat src/main/java/com/example/demo/ChunkyRestController.java
package com.example.demo;
import jakarta.servlet.http.HttpServletResponse;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;
import java.io.IOException;
import java.io.PrintWriter;
@RestController
@Slf4j
public class ChunkyRestController {
public record HelloWord(String message) {
}
@GetMapping("/")
public HelloWord showVetList() {
return new HelloWord("hi");
}
@PostMapping("/chunky/{times}/{addedheader_times}")
public void streamedFile(@PathVariable Integer times,@PathVariable Integer addedheader_times, HttpServletResponse response) throws IOException, InterruptedException {
for (int i = 0; i < addedheader_times; i++) {
log.info("added chunking header for the {}th time", i);
}
PrintWriter writer = response.getWriter();
for (int i = 0; i < times; i++) {
writer.println(i);
writer.flush();
log.info("flushed, waiting before flushing again {}/{}", i, times+1);
int waitIntervalMs = 100;
Thread.sleep(waitIntervalMs);
}
}
}
- I deployed the modified edition of your example app
- The POST method request was a success
% curl -X POST chunkynoheader.dev.enjoydevops.com/chunky/10/1 -v
* Trying 192.168.122.193:80...
* Connected to chunkynoheader.dev.enjoydevops.com (192.168.122.193) port 80 (#0)
> POST /chunky/10/1 HTTP/1.1
> Host: chunkynoheader.dev.enjoydevops.com
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
< Date: Tue, 26 Mar 2024 05:46:55 GMT
< Transfer-Encoding: chunked
< Connection: keep-alive
<
0
1
2
3
4
5
6
7
8
9
* Connection #0 to host chunkynoheader.dev.enjoydevops.com left intact
- So we can research into nginx v1.21, but one fact is clear now: chunking is on by default in nginx v1.24+ (https://nginx.org/en/docs/http/ngx_http_core_module.html#chunked_transfer_encoding). So you don't need to set the header in your app if you are using nginx v1.24+ anywhere near your app
I think we need a docs PR to be clear about this to users of controller v1.10.x+
/assign
/remove-triage needs-information /remove-kind bug /kind deprecation
Also, looking at your logic:
for (int i = 0; i < addedheader_times; i++) {
log.info("added chunking header for the {}th time", i);
response.addHeader("Transfer-Encoding", "chunked");
}
Logging at info level is not the problem. Setting the same header repeatedly is not allowed, I think. Nginx requires that header to be set only once per server. But I could be wrong. Some experts have to comment.
That for-loop is just for reproducing purposes, but I agree that this is a way to solve it for the affected systems. In at least one system where we see this behaviour, the header comes once via the embedded Tomcat "from the framework" and once explicitly via application code, resulting in 2 returned headers.
Setting the same header repeatedly is not allowed
That is something I am also wondering about regarding HTTP headers. At least here, nothing is explicitly stated regarding uniqueness: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding
The only restriction I can see is that
a) Transfer-Encoding: chunked cannot be used in combination with Content-Length: X
Here is some background on where we observe this issue:
a) someone added the following around 2018; at the time, Spring Boot 1.x was current and 2.0 had just been announced
public ResponseEntity<Resource> retrieve(final RetrievalRequest retrieveRequest) {
...
// we must set this here otherwise spring will set some weird
// content length which we do not know due to the streaming approach
// and we certainly don't want to determine the content length
// for larger zip files as this means that we have to keep everything
// in memory or on disk
//
// if we omit this then spring will get the content length from the zip resource, which is 0
//
// so we have to live with this header being set twice, once by us and once by spring or some backend thereof
headers.add("Transfer-Encoding", "chunked");
...
return new ResponseEntity<>(resource, headers, HttpStatus.OK);
}
b) this might, at the time, have produced only one response header. Meanwhile, Tomcat and Spring Boot have undergone many iterations, e.g. Tomcat was 8.0 in Boot 1.5 while we are now on Tomcat 10, two major versions ahead (just as one sample)
c) (assumption) the auto-detection for when the framework (either spring-web or Tomcat) adds a Transfer-Encoding: chunked header was enhanced, so we now get the header twice
d) nginx introduced a behaviour where these duplicate headers are now rejected by default
e) the reason it gets rejected is a bit unexpected; while I can see that the proxy sees the header twice, I have a hard time following why this warrants completely dropping the response and answering with a 502
2024/03/22 11:01:54 [error] 575#575: *28680 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 127.0.0.1, server: localhost, request: "POST /retrieve HTTP/1.1", upstream: "http://10.26.195.10:8080/retrieve", host: "localhost:8076"
We also see unexpected 502s where this header is not in play, but we have not been able to reproduce those in a test environment since the rollback. I am focusing on proving/reproducing the other ones and will keep them out of this discussion until I have a setup to showcase them.
We will attempt another upgrade with this; thanks for the analysis from your side and for this suggestion:
Other people on the internet, facing the same problem after upgrading to nginx v1.24+, resolved this by setting "chunked_transfer_encoding off;" in their nginx.conf
ok, thanks, this latest update from you was very insightful.
- I only see the error message and I am ignoring all else for now
- The error says duplicate, so I am completely getting on the same page as you after reading your update. There are multiple possibilities for duplicate setting of the header
- Now with all the duplicate settings of the header, be it in the app and/or additionally in Tomcat, I still think that the duplication complained about in the logs was from setting the header on each transfer, and I vaguely recall, from I don't know where, that you can set that header only once per connection. Unless of course your future tests prove otherwise
- You have info now to get to a solution, so I will not say much more, as I am not a developer and I am hoping other experts comment here
- But if you want, you can try to put the nginx webserver v1.25+ inside your pod, in front of Tomcat (let Tomcat be upstream to nginx), and then set the header in that nginx as well. That gives you the header in nginx + the header in Tomcat. This is to test whether setting the header twice is a problem. This would be different from setting the header twice in the same connection (my assumption, I could be wrong)
- Will see what future updates come from you. But I am also waiting for experts to confirm whether older nginx had chunking off by default, so we can update docs as needed
And yes, I read internet posts about Transfer-Encoding and Content-Length. So while I understood what I read, I will leave it to developers like you and developers of ingress-nginx to comment on that.
Now I read some text that says your Tomcat is violating HTTP specs: https://nginx.org/en/docs/faq/chunked_encoding_from_backend.html
We have the same 502 issue with v1.10.0. It's not "sometimes"; we get it constantly when calling a URL.
Restarting the controller and the services did not help.
Calling a service like: curl -kv https://xxxxxx.com/abc/abc/sdfsdfsdfsdfsdf
< HTTP/2 502
< date: Wed, 27 Mar 2024 09:27:18 GMT
< content-type: text/html
< content-length: 150
< strict-transport-security: max-age=31536000; includeSubDomains
<
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
Calling the service internally at cluster.local from a nearby pod always works, no error.
The weird thing is that in the ingress-controller log it looks like the pod responds with 502, but that's not true; the request did not reach it:
172.16.12.30 - xxxx [27/Mar/2024:09:26:03 +0000] "GET /abc/abc/sdfsdfsdfsdfsdf HTTP/2.0" 502 150 "-" "curl/7.81.0" 177 0.088
[xxxxxxxx-service-8080] [] 172.17.76.52:8080 0 0.088 502 44000**************
The request is not passed to the pod!
Changing the URL to anything like:
https://xxxxxx.com/abc/abc/sdfsdfsdfsdfsdfv1
solves the problem; the request reaches the pod. It's very weird.
We've rolled back to v1.9.6; without any other changes, the problem has been solved.
Calling the service internally at cluster.local from a nearby pod always works, no error.
@tatobi any chance you can submit the pod's response here? One of the 502 cases that we have was linked to the response header Transfer-Encoding: chunked. We also have other affected pods that do not use this header explicitly, but I was not yet able to reproduce it in a testing environment (for lack of testing data for the particular services).
Can you also verify if the pod receives the request at all? e.g. via an access-log or verbose logging?
@DRoppelt thanks. Sure, here it is below.
Verified: our Java-based service running in the pod responds with Transfer-Encoding: chunked even though the response is just a few characters.
> Host: **********.*.svc.cluster.local:8080
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
< Vary: Origin
< Vary: Access-Control-Request-Method
< Vary: Access-Control-Request-Headers
< Transfer-Encoding: chunked
< Date: Wed, 27 Mar 2024 12:**:** GMT
< Keep-Alive: timeout=60
< Connection: keep-alive
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 0
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Frame-Options: DENY
< Content-Type: application/json
< Transfer-Encoding: chunked
< Transfer-Encoding: chunked
...
< Transfer-Encoding: chunked
Appears to be the same issue that we were able to reproduce with longwuyuan: duplicate Transfer-Encoding header. The ingress pod should also show a time-wise matching error-log entry. In our services, we had Tomcat/Boot setting the header and the application also setting it in addition.
I looked into this initially
Other people on the internet, facing the same problem after upgrading to nginx v1.24+, resolved this by setting "chunked_transfer_encoding off;" in their nginx.conf
but since I found no corresponding ConfigMap entry (https://docs.nginx.com/nginx-ingress-controller/configuration/global-configuration/configmap-resource/), I opted to fix the affected services instead of working with config-snippets.
I am curious to see how this bug continues, but opted to fix that particular backend either way.
@DRoppelt thanks. I see the duplication. IMHO the new nginx version should tolerate it as the old ones did. Generally, header duplications shouldn't be handled as errors, at most as a warning. The questions in that case:
- RFC 2616 states that header duplications should be accepted. Logically, your request may go through multiple proxies in a multi-service stack, so it shouldn't be handled as an error.
- why do older nginx versions tolerate header duplications without any notice and the new ones do not?
- looking for where it is stated that new nginx versions will fail (502) in case of upstream header duplication; I cannot find it.
- whether new versions will be fixed in the future, or we have to fix it one by one in our deployments by adding chunked_transfer_encoding off to configmaps?
Thanks!
I agree on all terms and thanks for bringing up the RFC.
https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
Multiple message-header fields with the same field-name MAY be
present in a message if and only if the entire field-value for that
header field is defined as a comma-separated list [i.e., #(values)].
It MUST be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message
I would like a maintainer to chime in here and give a different perspective. The link that @longwuyuan shared (https://nginx.org/en/docs/faq/chunked_encoding_from_backend.html) appears to deal with something about nginx before 1.1.4.
regarding:
why do older nginx versions tolerate header duplications without any notice and the new ones do not?
it seems to be the default that flipped since 1.24:
https://github.com/kubernetes/ingress-nginx/issues/11162#issuecomment-2019448596 :
The chunking is on by default in nginx v1.24+ https://nginx.org/en/docs/http/ngx_http_core_module.html#chunked_transfer_encoding . So you don't need to set the header in your app, if you are using nginx v1.24+ anywhere near your app