Unknown reason triggers .doOnCancel(() -> cleanup(exchange)) in NettyWriteResponseFilter occasionally
Describe the bug
Spring Cloud Gateway version: 2.2.5
reactor-netty version: 0.9.15
- Using ServerHttpResponse.writeWith(Mono<DataBuffer>) instead of ServerHttpResponse.writeWith(Flux<DataBuffer>) in NettyWriteResponseFilter.java occasionally triggers doOnCancel(() -> cleanup(connection)) in high-concurrency scenarios.
- doOnCancel(() -> cleanup(connection)) closes the long-lived connection between the gateway and the downstream service. The connection pool does not detect this closure, so subsequent requests fail with reactor.netty.channel.AbortedException: Connection has been closed BEFORE response while sending request body
Describe the solution you'd like
- An explanation of the root cause of the Mono variant triggering doOnCancel
- The connection pool should detect when a connection has been closed
Sample
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        return chain.filter(exchange)
                .doOnError(throwable -> cleanup(exchange))
                .then(Mono.defer(() -> {
                    Connection connection = exchange.getAttribute(CLIENT_RESPONSE_CONN_ATTR);
                    if (connection == null) {
                        return Mono.empty();
                    }
                    ServerHttpResponse response = exchange.getResponse();
                    final Flux<DataBuffer> body = connection
                            .inbound()
                            .receive()
                            .retain()
                            .map(byteBuf -> wrap(byteBuf, response));
                    // My changes are here
                    Mono<DataBuffer> newBody = body.single();
                    MediaType contentType = null;
                    try {
                        contentType = response.getHeaders().getContentType();
                    }
                    catch (Exception e) {}
                    return (isStreamingMediaType(contentType)
                            ? response.writeAndFlushWith(body.map(Flux::just))
                            : response.writeWith(newBody));
                })).doOnCancel(() -> cleanup(exchange));
    }
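One thing that may be worth checking while looking for the root cause: connection.inbound().receive() can emit more than one buffer for a single response, and a single()-style operator expects exactly one item. The sketch below is a JDK-only model built on java.util.concurrent.Flow, not Reactor's code; RecordingSource, SingleSubscriber and run are hypothetical names made up for illustration. It only shows how such an operator cancels its upstream and fails when a second item arrives. Whether this is the path that fires the filter's outer doOnCancel is not established here.

```java
import java.util.List;
import java.util.concurrent.Flow;

// JDK-only model of a single()-style operator; NOT Reactor's implementation.
class SingleOperatorSketch {

    /** Synchronous source that emits all items and records whether it was cancelled. */
    static final class RecordingSource implements Flow.Publisher<String> {
        private final List<String> items;
        boolean cancelled;

        RecordingSource(List<String> items) {
            this.items = items;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super String> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private boolean started;

                @Override
                public void request(long n) {
                    if (started) {
                        return; // everything is emitted on the first request; demand is ignored for brevity
                    }
                    started = true;
                    for (String item : items) {
                        if (cancelled) {
                            return;
                        }
                        subscriber.onNext(item);
                    }
                    if (!cancelled) {
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() {
                    cancelled = true;
                }
            });
        }
    }

    /** Expects exactly one item; cancels the source and fails when a second one arrives. */
    static final class SingleSubscriber implements Flow.Subscriber<String> {
        private Flow.Subscription subscription;
        private String value;
        private int count;
        private boolean done;
        String outcome = "pending";

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(Long.MAX_VALUE);
        }

        @Override
        public void onNext(String item) {
            if (done) {
                return;
            }
            if (++count > 1) {
                subscription.cancel(); // the upstream sees a cancel, not a completion
                onError(new IndexOutOfBoundsException("Source emitted more than one item"));
                return;
            }
            value = item;
        }

        @Override
        public void onError(Throwable t) {
            if (done) {
                return;
            }
            done = true;
            outcome = "error: " + t.getClass().getSimpleName();
        }

        @Override
        public void onComplete() {
            if (done) {
                return;
            }
            done = true;
            outcome = (count == 1) ? "value: " + value : "error: NoSuchElementException";
        }
    }

    /** Subscribes a SingleSubscriber to the source and summarises what happened. */
    static String run(RecordingSource source) {
        SingleSubscriber subscriber = new SingleSubscriber();
        source.subscribe(subscriber);
        return subscriber.outcome + ", upstream cancelled: " + source.cancelled;
    }

    public static void main(String[] args) {
        System.out.println(run(new RecordingSource(List.of("whole-body"))));
        System.out.println(run(new RecordingSource(List.of("chunk-1", "chunk-2"))));
    }
}
```

In this model a one-item source completes normally, while a two-item source ends with an error and a cancelled upstream. If the real response body arrives in several buffers under load, logging how many buffers receive() emits per response would be a cheap way to test this hypothesis.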
Is this still an issue with the supported version of spring cloud 4.1.1?
Yes, the issue still exists.
Please focus mainly on the effect of my changes to the NettyWriteResponseFilter source code.
I added comments to the sample code.
Can you tell me how to reproduce it?
Test topology:
- JMeter (sending HTTP POST requests; 100 QPS is enough) ->
- Spring Cloud Gateway (version 4.1.1 reproduces it; modify NettyWriteResponseFilter as described in my previous comments) ->
- Downstream service (a Netty HTTP server or a Spring Boot server)
Notes:
- JMeter runs on Windows 10. The number of threads in the thread group should be greater than 1.
- Every other service runs on a 2-core, 4 GB Ubuntu virtual machine.
- If the number of errors is positively correlated with QPS, the issue has been reproduced.
- The error is "Connection has been closed BEFORE response, while sending request body". Alternatively, add logging inside doOnCancel(() -> cleanup(exchange)); seeing that log also counts as a reproduction.
Is the issue I raised related to this one? https://github.com/reactor/reactor-netty/issues/741
Does the issue described in the title have a solution? I have also been troubled by it in my project for a long time.