Close an HTTP/1.1 connection after delay
Motivation:
- Closes https://github.com/line/armeria/issues/4849
- To suppress a server socket from remaining in the
TIME_WAITstate instead ofCLOSEDwhen a connection is closed.
Modifications:
- Remove the force shutdown mode of
AbstractHttpResponseHandlerthat is activated when a user sets theConnection: closeHTTP header - Create a schedule that waits and closes the connection to wait for the client-side connection to close, rather than closing the HTTP/1.1 connection immediately.
Result:
- Closes https://github.com/line/armeria/issues/4849
- When a user wants to close an HTTP/1.1 server connection, the armeria server first waits for the client to close the connection for a while to make the connection state to
CLOSEDinstead ofTIME_WAIT.
๐ Build Scanยฎ (commit: 596b223ae457395390a356bc20b684de119e53df)
| Job name | Status | Build Scanยฎ |
|---|---|---|
| build-windows-latest-jdk-21 | โ | https://ge.armeria.dev/s/5waqxdyew7bpw |
| build-self-hosted-unsafe-jdk-8 | โ | https://ge.armeria.dev/s/invs7rd7eogyo |
| build-self-hosted-unsafe-jdk-21-snapshot-blockhound | โ | https://ge.armeria.dev/s/nza6his2f7dhy |
| build-self-hosted-unsafe-jdk-17-min-java-17-coverage | โ | https://ge.armeria.dev/s/ncas2zwrbv67i |
| build-self-hosted-unsafe-jdk-17-min-java-11 | โ | https://ge.armeria.dev/s/qov5igwrcmcey |
| build-self-hosted-unsafe-jdk-17-leak | โ | https://ge.armeria.dev/s/nwk6m7sjehp62 |
| build-self-hosted-unsafe-jdk-11 | โ | https://ge.armeria.dev/s/ne7f4532ze7gm |
| build-macos-12-jdk-21 | โ | https://ge.armeria.dev/s/2spxi2yyxc2uk |
https://github.com/line/armeria/pull/5616#issuecomment-2060205298 Please also sign the CLA ๐
Looking into the failing test, I guess there is a case where Connection: close header isn't sent but we set needsDisconnection() somewhere.
Went through the code, and I think the issue is the following:
- The test failure occurs when
testTooLargeContentToNonExistentServiceis run, and thentestStreamingResponseWithSlowClientis run with HTTP1.1. The two use the sameClientFactory, and hence reuse connections.- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/test/java/com/linecorp/armeria/server/HttpServerStreamingTest.java#L161
- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/test/java/com/linecorp/armeria/server/HttpServerStreamingTest.java#L220
- When
testTooLargeContentToNonExistentServiceis run, a404is returned from theFallbackService. This response is subsequently written to the wire without rescheduling.- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/main/java/com/linecorp/armeria/server/FallbackService.java#L46
- However, afterwards the request is continuously consumed and reaches
maxRequestLength. At this point, we setdisconnectWhenFinishedalthough the response has already been written. Hence, we don't have a chance to send aConnection: closeheader to the client.- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/main/java/com/linecorp/armeria/server/Http1RequestDecoder.java#L324-L324
- Since we don't close the connection immediately, the client believes the connection is healthy and tries to reuse this connection. However, the
discardingflag is already set and ensuing requests are silently ignored- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/main/java/com/linecorp/armeria/server/Http1RequestDecoder.java#L320
- Lastly, the idle timeout kicks in which fails the test
Overall, I think it's probably fine to just close the connection immediately if we can't write the 413 response headers. This isn't a perfect solution, but at least preserves backwards compatibility.
- We can check
encoder.isResponseHeadersSent(id, 1)at the following location, and close the channel immediately if the response headers have already been written.- https://github.com/line/armeria/blob/0a5a3d7a2f74f51a4893270a6f4aeac5f7e3c575/core/src/main/java/com/linecorp/armeria/server/Http1RequestDecoder.java#L324-L325
- Otherwise the 413 with
Connection: closewill be written, so we can preserve the previous behavior.
https://github.com/line/armeria/pull/5616#issuecomment-2084886389
When also I analysed the reason for the failure of this test, I found that the response is sent immediately at 404. And when the content size exceeds maxRequestLength while reading the rest of the request content, the Connetion: close header is not sent and the server closes the connection. The problem was that the connection discarding flag was set and the will be closed soon was reused in the next test. Because this PR does not check if a response has already been sent, but just waits for the set time before closing the connection, the client didn't know that the connection was about to be closed and reused it causing the connection to be closed during the test.
To solve this, I was trying to figure out how to make it fail fast, as I think the correct behaviour is to ignore the content if it's a 404.
If I knew the length of the content, it shouldn't be a problem. But I thought there might be an issue with using chunked transfer encoding.
I also wondered if it would be better to just set discarding to true and read the request to the end and send the response, since even in the 404, the server reads all the content sent by the client anyway.
When I thought about the method you advised, it might be a problem when it takes a long time to reach the content max length so the same phenomenon can occur like in the test. It would look like the connection is suddenly closed from the client's point of view.
However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.
Thanks for your advice ๐
However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.
Thanks! Let us know if you would like to try handling this as a follow-up issue.
However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.
Thanks! Let us know if you would like to try handling this as a follow-up issue.
Yes, I would like to!