
RestClient hangs when Destination closes Connection during Request Body Upload


The client continues to block even though the receiving server has already closed the connection. Reproduced on Spring Boot 3.4.4; it works with 3.3.10, likely because that version doesn't use io.projectreactor.netty:reactor-netty-http. Sample application reproducing the bug: https://github.com/mspiess/http-client-bug-reproduction. The necessary steps are in that repository's README.md. Excuse the convoluted setup, but I have been unable to reproduce it outside of Docker.

The deprecated exchangeTimeout on ReactorClientHttpRequestFactory can be used to force a timeout, but the connection is still leaked. The other timeout options seem to have no effect.
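
Roughly how I applied the timeout (a sketch; I'm assuming the Duration-based setter on the deprecated property):

    import java.time.Duration
    import org.springframework.http.client.ReactorClientHttpRequestFactory
    import org.springframework.web.client.RestClient

    // Workaround attempt: cap the whole exchange via the deprecated
    // exchangeTimeout. The call then fails after 10 seconds, but the
    // underlying connection is still leaked.
    val factory = ReactorClientHttpRequestFactory().apply {
        setExchangeTimeout(Duration.ofSeconds(10))
    }
    val restClient = RestClient.builder().requestFactory(factory).build()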

mspiess avatar Apr 02 '25 16:04 mspiess

Possibly related to https://github.com/spring-projects/spring-framework/issues/34178

mspiess avatar Apr 03 '25 13:04 mspiess

The issue remains on Spring Boot version 3.5.0. Setting spring.http.reactiveclient.connector=jdk does not fix it.

mspiess avatar Jun 03 '25 12:06 mspiess

Hello @mspiess and thanks for reaching out.

Indeed, this is a complex issue that is hard to reproduce. I couldn't reproduce it on my local machine (macOS) but did see this behavior with the Docker container approach.

I don't think this issue belongs in Spring Framework, so I'm going to close it. Let me explain. I have tried a few things to narrow down the problem:

Configuring the JDK client as a connector

First, this issue only happens with the Reactor Netty connector. Setting spring.http.reactiveclient.connector=jdk does not change the behavior because your application is not using the RestClient.Builder auto-configured by Spring Boot, but rather a locally built client that relies on classpath detection; in this case Reactor Netty is always chosen.
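
For the auto-configured builder (and the spring.http.* properties) to apply, it has to be injected rather than created with RestClient.builder(). A sketch, with an illustrative class name:

    import org.springframework.stereotype.Service
    import org.springframework.web.client.RestClient

    @Service
    class UpstreamClient(builder: RestClient.Builder) {
        // Spring Boot's auto-configured RestClient.Builder applies the
        // configured HTTP client properties instead of relying on
        // classpath detection.
        private val restClient: RestClient = builder.build()
    }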

Manually opting in to the JDK client works:

private val restClient: RestClient = RestClient.builder()
        .requestFactory(JdkClientHttpRequestFactory())
        .build()

This yields:

java.io.EOFException: EOF reached while reading
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver$Http1TubeSubscriber.onComplete(Http1AsyncReceiver.java:601) ~[java.net.http:na]
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadSubscription.signalCompletion(SocketTube.java:648) ~[java.net.http:na]
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription.read(SocketTube.java:853) ~[java.net.http:na]

Changing the amount of data written out

If we're writing less data (but still going over the configured proxy limit):

        val obj = object {
            val str = "ABCDEFGHIJ".repeat(50_500)
        }

The Reactor client successfully sees the closed connection:

2025-08-25 17:15:23.402 | |00000780| 41 42 43 44 45 46 47 48 49 4a 41 42 43 44 45 46 |ABCDEFGHIJABCDEF|
2025-08-25 17:15:23.402 | |00000790| 47 48 49 4a 41 42 43 44 45 46 47 48 49 4a 41 42 |GHIJABCDEFGHIJAB|
2025-08-25 17:15:23.402 | |000007a0| 43 44 45 46 47 48 49 4a 41 42 43 44 45 46 47 48 |CDEFGHIJABCDEFGH|
2025-08-25 17:15:23.402 | |000007b0| 49 4a 41 42 43 44 45 46 47 48 49 4a 41 42 43 44 |IJABCDEFGHIJABCD|
2025-08-25 17:15:23.402 | |000007c0| 45 46 47 48 49 4a 41 42 43 44 45 46 47 48 49 4a |EFGHIJABCDEFGHIJ|
2025-08-25 17:15:23.402 | |000007d0| 22 7d                                           |"}              |
2025-08-25 17:15:23.402 | +--------+-------------------------------------------------+----------------+
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] WRITE: 2B
2025-08-25 17:15:23.402 |          +-------------------------------------------------+
2025-08-25 17:15:23.402 |          |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
2025-08-25 17:15:23.402 | +--------+-------------------------------------------------+----------------+
2025-08-25 17:15:23.402 | |00000000| 0d 0a                                           |..              |
2025-08-25 17:15:23.402 | +--------+-------------------------------------------------+----------------+
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.399Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.400Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.401Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.401Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.401Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.402 | 2025-08-25T15:15:23.401Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.405 | 2025-08-25T15:15:23.405Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] WRITABILITY CHANGED
2025-08-25 17:15:23.406 | 2025-08-25T15:15:23.406Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] WRITE: 5B
2025-08-25 17:15:23.406 |          +-------------------------------------------------+
2025-08-25 17:15:23.406 |          |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
2025-08-25 17:15:23.406 | +--------+-------------------------------------------------+----------------+
2025-08-25 17:15:23.406 | |00000000| 30 0d 0a 0d 0a                                  |0....           |
2025-08-25 17:15:23.406 | +--------+-------------------------------------------------+----------------+
2025-08-25 17:15:23.406 | 2025-08-25T15:15:23.406Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] FLUSH
2025-08-25 17:15:23.407 | 2025-08-25T15:15:23.406Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 - R:toxiproxy/172.19.0.2:80] READ COMPLETE
2025-08-25 17:15:23.409 | 2025-08-25T15:15:23.409Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 ! R:toxiproxy/172.19.0.2:80] INACTIVE
2025-08-25 17:15:23.412 | 2025-08-25T15:15:23.411Z  WARN 1 --- [demo1] [ctor-http-nio-2] r.netty.http.client.HttpClientConnect    : [39ebc2bf-1, L:/172.19.0.3:45288 ! R:toxiproxy/172.19.0.2:80] The connection observed an error
2025-08-25 17:15:23.412 | 
2025-08-25 17:15:23.412 | reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
2025-08-25 17:15:23.412 | 
2025-08-25 17:15:23.413 | 2025-08-25T15:15:23.413Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [39ebc2bf-1, L:/172.19.0.3:45288 ! R:toxiproxy/172.19.0.2:80] UNREGISTERED
2025-08-25 17:15:23.417 | 2025-08-25T15:15:23.416Z ERROR 1 --- [demo1] [nio-8080-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed: org.springframework.web.client.ResourceAccessException: I/O error on POST request for "http://toxiproxy": Connection prematurely closed BEFORE response] with root cause
2025-08-25 17:15:23.417 | 
2025-08-25 17:15:23.417 | reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
2025-08-25 17:15:23.417 | 

Using Reactor Netty's HttpClient directly

I have tried using the Reactor Netty HttpClient directly (roughly as in the sketch below) but could not reproduce the problem. I think this is likely a concurrency issue, and changing the reactive pipeline can lead to subtle differences in behavior.
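
A sketch of the kind of direct call I mean; the URI and payload mirror the repro setup:

    import reactor.core.publisher.Mono
    import reactor.netty.ByteBufFlux
    import reactor.netty.http.client.HttpClient

    // Drive Reactor Netty directly, bypassing RestClient; the payload
    // matches the repro's oversized request body.
    val body = "ABCDEFGHIJ".repeat(50_500)
    val status = HttpClient.create()
        .post()
        .uri("http://toxiproxy")
        .send(ByteBufFlux.fromString(Mono.just(body)))
        .responseSingle { response, _ -> Mono.just(response.status()) }
        .block()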

Configuring wiretap debugging

I have configured wiretap debugging for the client and saw the output below.
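
A sketch of the wiretap setup, assuming the customized HttpClient is handed to the request factory:

    import io.netty.handler.logging.LogLevel
    import org.springframework.http.client.ReactorClientHttpRequestFactory
    import org.springframework.web.client.RestClient
    import reactor.netty.http.client.HttpClient
    import reactor.netty.transport.logging.AdvancedByteBufFormat

    // Log every read, write, and flush on the channel, including a
    // hex/ASCII dump of the payload.
    val httpClient = HttpClient.create()
        .wiretap("reactor.netty.http.client.HttpClient",
                 LogLevel.DEBUG, AdvancedByteBufFormat.HEX_DUMP)

    val restClient = RestClient.builder()
        .requestFactory(ReactorClientHttpRequestFactory(httpClient))
        .build()

With the wiretap enabled, the log showed: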

|00001f00| 47 48 49 4a 41 42 43 44 45 46 47 48 49 4a 41 42 |GHIJABCDEFGHIJAB|
|00001f10| 43 44 45 46 47 48 49 4a 41 42 43 44 45 46 47 48 |CDEFGHIJABCDEFGH|
|00001f20| 49 4a 41 42 43 44 45 46 47 48 49 4a 41 42 43 44 |IJABCDEFGHIJABCD|
|00001f30| 45 46 47 48 49 4a 41 42 43 44 45 46 47 48 49 4a |EFGHIJABCDEFGHIJ|
+--------+-------------------------------------------------+----------------+
2025-08-25T15:18:48.174Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [acb84f12-1, L:/172.19.0.3:52380 - R:toxiproxy/172.19.0.2:80] WRITE: 2B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 0d 0a                                           |..              |
+--------+-------------------------------------------------+----------------+
2025-08-25T15:18:48.174Z DEBUG 1 --- [demo1] [ctor-http-nio-2] reactor.netty.http.client.HttpClient     : [acb84f12-1, L:/172.19.0.3:52380 - R:toxiproxy/172.19.0.2:80] FLUSH

The client does not raise any PrematureCloseException, which means that the issue is probably in Reactor Netty.

I have submitted a PR to your repro project to reflect my changes.

Can you create a new issue in https://github.com/reactor/reactor-netty/issues please?

bclozel avatar Aug 25 '25 15:08 bclozel

See https://github.com/reactor/reactor-netty/issues/3889

bclozel avatar Sep 19 '25 09:09 bclozel

Hello @bclozel

Apparently Spring is blocking the event loop. Can we reopen this issue?

mspiess avatar Oct 03 '25 23:10 mspiess

After discussing this issue with @rstoyanchev, here's our current understanding of the problem.

Our ReactorClientHttpRequest uses Reactor's client in a blocking fashion by adapting an OutputStream to a Publisher with org.springframework.http.client.OutputStreamPublisher. At various times, writing to the OutputStream can be suspended if there is not enough demand for data; this is done by awaiting and parking the current thread.

Right now, the OutputStreamPublisher needs a dedicated Executor for that, and ReactorClientHttpRequest fetches that executor from the current Netty event loop. In general, any kind of blocking on the event loop is a problem. What happens here is that Reactor Netty can pre-fetch, write, and flush data in various ways, and blocking the event loop at the wrong time can lead to an invalid state where Reactor Netty is never given a chance to report back about the state of the connection.

What we should do here is perform the wait operations on a dedicated thread: not an event loop thread, nor the current thread (which could be a Servlet container thread). As an experiment, I have locally modified the code to take a custom Reactor Scheduler in OutputStreamPublisher and passed in Schedulers.boundedElastic(). With that change, the problem goes away.
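
To illustrate the idea, a conceptual sketch only (not Spring's actual OutputStreamPublisher internals): the blocking writer is subscribed on a scheduler, so any parking while awaiting demand happens off the event loop.

    import java.io.OutputStream
    import reactor.core.publisher.Flux
    import reactor.core.scheduler.Schedulers

    // Adapt a blocking OutputStream writer to a Flux. The writer callback
    // may park while awaiting demand, so it must not run on the event loop.
    fun outputStreamToFlux(writer: (OutputStream) -> Unit): Flux<ByteArray> =
        Flux.create<ByteArray> { sink ->
            val out = object : OutputStream() {
                override fun write(b: Int) {
                    sink.next(byteArrayOf(b.toByte()))
                }
                override fun write(b: ByteArray, off: Int, len: Int) {
                    sink.next(b.copyOfRange(off, off + len))
                }
            }
            writer(out)
            sink.complete()
        }.subscribeOn(Schedulers.boundedElastic()) // blocking runs here, off the event loop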

Now we should carefully consider which Scheduler/Executor to use here. A bounded elastic one works, but is not necessarily the best choice. On a setup with virtual threads, a single-thread executor with a virtual thread factory would be a good choice; on any other setup, a shared, bounded scheduler would work.
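
For example, the virtual thread option could look like this on JDK 21+:

    import java.util.concurrent.Executors

    // Parking a virtual thread while awaiting demand does not pin a
    // platform thread, so blocking on it is cheap.
    val executor = Executors.newThreadPerTaskExecutor(Thread.ofVirtual().factory())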

We need to discuss this further in the team.

bclozel avatar Oct 06 '25 16:10 bclozel

https://github.com/spring-cloud/spring-cloud-gateway/issues/3901 looks to be another instance of the same issue.

rstoyanchev avatar Nov 06 '25 13:11 rstoyanchev

@rstoyanchev can this be prioritized for an upcoming release? There is no easy workaround until we have the fix.

AkashB23 avatar Nov 10 '25 06:11 AkashB23

Fixed via ba39385cced0ac3981f9e320b38fb09bbe4a82d3

github-actions[bot] avatar Nov 10 '25 15:11 github-actions[bot]