traefik icon indicating copy to clipboard operation
traefik copied to clipboard

Traefik erroneously closes a valid connection

Open apollo13 opened this issue 7 months ago • 1 comments

Welcome!

  • [x] Yes, I've searched similar issues on GitHub and didn't find any.
  • [x] Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

Bear with me, this is a long and tricky one… Our setup involves haproxy -> traefik -> appserver whereas Traefik and the appserver are on the same host (might affect timings…). Interestingly the following problem does not occur if we test directly against Traefik and leave haproxy out. That said we don't see anything in tcpdump that would indicate that this is a haproxy problem, everything points to Traefik.

So on to the bug: Occasionally (well actually quite often, roughly 0.5% of the time) we are seeing

 2025/05/15 12:59:53 reverseproxy.go:677: httputil: ReverseProxy read error during body copy: read tcp 127.0.0.1:58772->127.0.0.1:10990: use of closed network connection

in the logs and as a result of that Traefik closes the client connection from HaProxy. The issue exists since a while but got worse due to https://github.com/traefik/traefik/pull/11129 -- in our specific case (while wrong) the previous behavior of sending an empty chunk provided the correct result by accident :D (go figure…).

So what does happen? Whenever we hit this issue our backend server return a chunked response (it always returns chunked responses) with two chunks: one data chunk and one empty chunk. Now instead of sending the empty chunk back to the client (haproxy) Traefik resets the connection to the backend server and closes the connection to the client:

Image

Let me explain what we are seeing here:

  • 171-173 is the standard tcp handshake from Haproxy to Traefik (10.7.192.2 to 10.7.220.115)
  • 174,175 is traefik receiving the request from haproxy and ACKing it
  • 176,177,178 is traefik sending the request to the backend server (tomcat) and getting back an ACK
  • 179 is tomcat returning headers + first datachunk to traefik
  • 180 is Traefik (correctly) returning the data to HaProxy
  • 181 is Traefik receiving the final 0 chunk from Tomcat (see bottom of screenshot)
  • 182 is Traefik ACKing that data
  • 183 is Traefik RSTing the persistent connection to the backend server -- Why?
  • 184ff is the FIN/ACK dance where Traefik shuts down the connection to haproxy instead of sending the empty chunk

We found https://github.com/golang/go/issues/40747 which is closed but not fixed and could maybe (?) get fixed by EnableFullDuplex. I will try fastproxy later on today if it makes a diff.

Do you have any ideas on how to debug this further?

What did you see instead?

.

What version of Traefik are you using?

We tried: 2.11.15 2.11.24 3.4.0

What is your environment & configuration?

[providers.file]
    directory = "/etc/traefik2/conf.d"
    watch = true

[global]
    checkNewVersion = false
    sendAnonymousUsage = false

[entryPoints]
    
    [entryPoints.web]
        address =  ":80"

    [entryPoints.websecure]
        address =  ":443"

[log]
level="DEBUG"

and the dynamic config:

http:
  routers:
    server-https:
      rule: "PathPrefix(`/`)"
      service: server
      entryPoints:
        - web
        - websecure
      tls: {}
      middlewares:
        - sethsts

  services:
    server:
      loadBalancer:
        servers:
          - url: http://127.0.0.1:10990 

If applicable, please paste the log output in DEBUG level

apollo13 avatar May 19 '25 06:05 apollo13

Okay, we managed to pinpoint it further: If we configure Haproxy to send a Connection: close header then the issue does not occur. So this issue manifests itself only if the Client uses HTTP keep-alive and closes the connection via a TCP FIN dance after receiving the response. If the client send Connection: close then Traefik closes the connection not exhibiting the problem. This might be a result of traefik reading the inbound stream due to request pipelining which is allowed with HTTP/1.1?

apollo13 avatar May 19 '25 08:05 apollo13