jmeter
HTTP connections closed by the server remain in CLOSE_WAIT state
Expected behavior
If the server closes an HTTP connection, JMeter should handle the closed connection immediately rather than keeping it in CLOSE_WAIT state until the next request, which can also lead to "Non HTTP response" errors.
Actual behavior
Currently, after JMeter makes an HTTP request, the connection is kept alive and the server waits for the next request. If the server closes the connection after X seconds and sends a FIN/ACK packet, JMeter does nothing. If the next request is made within 5 seconds of the FIN/ACK packet, it fails immediately with a "Non HTTP response message: <HOST>:<PORT> failed to respond" error, because the socket is already in CLOSED state and can't pick up new packets.
If the request is sent more than 5 seconds later, JMeter somehow gracefully sends a FIN/ACK back, creates a new connection, and makes the request.
Steps to reproduce the problem
Set keep-alive timeout to 2 seconds on the server, so it shuts down the connection after 2 seconds.
Then create a JMeter script that sends request 1, waits 3 seconds, then sends request 2. Request 2 ends up with the Non HTTP response error, because it is sent on a closed socket. With a wait of 6 seconds rather than 3, it somehow works without an error.
JMeter Version
5.6.3
Java Version
17
OS Version
Mac + Ubuntu
I've done some more testing on this. It doesn't turn into a non-HTTP error against every server, and I have no idea why it does with the one server I'm using, but the problem remains. Against many servers, after the FIN/ACK is sent and JMeter leaves the connection in CLOSE_WAIT state, JMeter first sends a FIN/ACK back when it wants to make a new request, which cleans up the connection, and then makes the request. Against some servers it doesn't send the FIN/ACK first but just sends the request over the existing connection, and the server sends a RST because the connection has either been cleaned up or is in CLOSE_WAIT state.
It is a serious issue, because once the server tries to close the connection (after its keep-alive timeout), the connection remains in CLOSE_WAIT state on the JMeter client and in FIN_WAIT_2 state on the web server, still occupying a socket that should no longer be there. Browsers clean up the socket immediately, so it is gone on both the client and the server, which is the behaviour we need to see in JMeter.
I've made a setup with a basic nginx with the keep-alive timeout changed to 2 seconds, and a script that makes a request (with keep-alive) and then a new request 4 seconds later. The second request fails due to this RST and the connection being in CLOSE_WAIT state.
It only happens with the HttpClient4 implementation. The Java implementation has no issues
So I've found the root cause of the issue: it's the httpclient4.validate_after_inactivity setting.
This defaults to 4900 (ms), meaning that if the server closes a connection within 4900 ms and a new request is made between that close and the 4900 ms mark, JMeter just sends the request over the connection in CLOSE_WAIT state, resulting in a connection reset error.
I think this is a bad implementation in the HttpClient4 library. It should simply remove a connection once the server requests it.
A workaround is to set httpclient4.validate_after_inactivity to a low value (900). This only resolves the unexpected "Non HTTP response" errors. It doesn't resolve the large number of connections in CLOSE_WAIT state, which take up unrealistically many connections on JMeter clients and leave connections in FIN_WAIT_2 state on web servers.
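To make the timing window concrete, here is a minimal sketch of the decision as described above. It is a simplified model, not HttpClient's actual code: the class and method names are hypothetical, and it assumes the server closes a connection once it has been idle longer than its keep-alive timeout.

```java
final class InactivityWindow {
    // Hypothetical model: a pooled connection is re-validated before reuse
    // only once it has been idle for at least the configured threshold.
    static boolean validatedBeforeReuse(long idleMillis, long validateAfterInactivityMillis) {
        return idleMillis >= validateAfterInactivityMillis;
    }

    // A request lands on a dead connection when the server has already closed it
    // (idle time exceeds the server keep-alive timeout) and validation is skipped.
    static boolean reusesClosedConnection(long idleMillis, long serverKeepAliveMillis,
                                          long validateAfterInactivityMillis) {
        boolean serverClosed = idleMillis > serverKeepAliveMillis;
        return serverClosed && !validatedBeforeReuse(idleMillis, validateAfterInactivityMillis);
    }

    public static void main(String[] args) {
        long keepAlive = 2000; // server keep-alive timeout from the reproduction steps
        System.out.println(reusesClosedConnection(3000, keepAlive, 4900)); // true: 3 s wait, default setting
        System.out.println(reusesClosedConnection(6000, keepAlive, 4900)); // false: 6 s wait, default setting
        System.out.println(reusesClosedConnection(3000, keepAlive, 900));  // false: 3 s wait, workaround value
    }
}
```

Under this model, the error window is any idle time between the server's keep-alive timeout and the validation threshold, which is why lowering the threshold below the keep-alive timeout removes the errors but not the lingering half-closed sockets.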
I experienced the same issue - thanks for posting here! Should a ticket be added for this? Decreasing the validation period helps but as you mentioned doesn't resolve the high CLOSE_WAIT states etc.
Well, I created a ticket with this, right? But indeed, it should actually be resolved so that JMeter behaves like a browser or any other HTTP client: it should respond to the FIN packet, close the socket on the JMeter side, and send a FIN back, so the server can remove the half-closed state as well.
Ah you are right - I thought it is still the bugzilla tracker... But since 2022 it was migrated here. So everything is fine :-)
Great find, I've been suffering from this issue as well. From the above it sounds like an issue was also filed with Apache? Would you mind providing the link? I've been searching the Apache Jira without success.
No, the only issue is this one here. But as it's an HttpClient bug, I think it should be reported there.
any update on this ticket ?
Sorry, after analysing this and putting everything on a plate, nothing happens. I'm not sure how big the JMeter team is, but I've made some interesting proposals and even PRs, and nothing is happening. The last JMeter update was more than a year ago, so it feels like a dead project lately.
If the server closes an HTTP connection, jmeter should handle the closed connection immediately and not keeping it in CLOSE_WAIT state until the next request, which could also lead to "Non HTTP response" errors
Well, could you please clarify why you would expect exactly this behavior? As far as I understand, it is pretty much fine for the "server" to close a connection while the client could still send some data.
In other words, it means "the server is done sending its bytes". However, it is fine for a TCP connection to be in a half-closed state, and the client can still send data and the server can receive it.
Currently, when JMeter does an HTTP request and is in keep-alive state and server waits for the next request. If the server closes the connection after X-seconds and sends a FIN/ACK packet, JMeter does nothing. If the next request is done within 5 seconds after the FIN/ACK packet, it sets a "Non HTTP response message: : failed to respond" error immediately, because the socket is already in CLOSED state and can't pick up new packets. This results in this error.
Well, this boils down to the expected outcome of your test.
In practice, people use JMeter to simulate actual applications. So you should configure JMeter exactly the same as the application/microservice/browser you try to impersonate.
For instance, if the application/microservice/browser does not expect the server to close a connection shortly, then the application would fall into the same issue of trying to send data over a broken connection. In other words, "Non HTTP response message: : failed to respond" surfaces a configuration error (assuming you've configured JMeter the same as your app).
If the application performs connection re-validation, you should configure JMeter to do so.
Does that make sense?
but doesn't resolve the high CLOSE_WAIT state taking unrealistic higher connections on jmeter clients and left FIN_WAIT_2 state on webservers
If your webserver closes connections immediately while JMeter trying to keep them alive, I expect it might be the following: a) It might be a true configuration bug discovered by JMeter. In other words, you will have "many CLOSE_WAIT connections" in production if clients attempt to use keepalive while the server closes the connections early b) If the clients do not use keepalive in production, you should disable keepalives in JMeter as well
WDYT?
This appears to be a critical bug for my use case as well. The same test script running against the same application under test runs great with JMeter 5.5 but grinds to a halt on JMeter 5.6.3 on account of timeouts due to these CLOSE_WAIT connections. This, in concert with the transaction controller memory leak issue, has forced us to downgrade to JMeter 5.5.
For the transaction controller, the fix is either: don't enable 'Generate parent sample', or make a build of this PR: https://github.com/apache/jmeter/pull/6386
Regarding CLOSE_WAIT: lowering httpclient4.validate_after_inactivity, to 900 ms for example, doesn't fix the server side, but it does help with the errors.
If the server closes an HTTP connection, jmeter should handle the closed connection immediately and not keeping it in CLOSE_WAIT state until the next request, which could also lead to "Non HTTP response" errors
Well, could you please clarify why you would expect exactly this behavior? As far as I understand, it is pretty much fine for the "server" to close a connection while the client could still send some data.
In other words, it means "the server is done sending its bytes". However, it is fine for a TCP connection to be in a half-closed state, and the client can still send data and the server can receive it.
First of all, it is not fine for a TCP connection to be in a half-closed state. The client can't send data that the server would receive, at least not in a FIN_WAIT_2 / CLOSE_WAIT state, because that state means the server is trying to shut down the socket but the client hasn't done so yet. If the client then sends a packet as if the connection were still open, it gets a RST, which is not fine.
Currently, when JMeter does an HTTP request and is in keep-alive state and server waits for the next request. If the server closes the connection after X-seconds and sends a FIN/ACK packet, JMeter does nothing. If the next request is done within 5 seconds after the FIN/ACK packet, it sets a "Non HTTP response message: : failed to respond" error immediately, because the socket is already in CLOSED state and can't pick up new packets. This results in this error.
Well, this boils down to the expected outcome of your test.
In practice, people use JMeter to simulate actual applications. So you should configure JMeter exactly the same as the application/microservice/browser you try to impersonate.
For instance, if the application/microservice/browser does not expect the server to close a connection shortly, then the application would fall into the same issue of trying to send data over a broken connection. In other words, "Non HTTP response message: : failed to respond" surfaces a configuration error (assuming you've configured JMeter the same as your app).
If the application performs connection re-validation, you should configure JMeter to do so.
Does that make sense?
You are right about simulating the client's behaviour. In most cases, the client you are simulating is either a browser or some kind of microservice making API calls. Either way, whenever the other end (the server) sends a FIN/ACK packet, the client should simply respond with a FIN/ACK as well, so both ends can close the socket (similar to a SYN request, where the other end has to respond with SYN/ACK to accept it). None of the browsers, at least, do nothing after a FIN/ACK. They all accept the disconnect and do not try to send a new request over a half-closed socket. We can lower the idle time with httpclient4.validate_after_inactivity to avoid non-HTTP errors in JMeter, but either way JMeter should instantly close the connection on the client and respond, so the server can fully remove the socket and it is not kept in CLOSE_WAIT. Maybe there are clients that behave the way JMeter does because they use the httpclient4/5 library, but that would be very rare.
but doesn't resolve the high CLOSE_WAIT state taking unrealistic higher connections on jmeter clients and left FIN_WAIT_2 state on webservers
If your webserver closes connections immediately while JMeter trying to keep them alive, I expect it might be the following: a) It might be a true configuration bug discovered by JMeter. In other words, you will have "many CLOSE_WAIT connections" in production if clients attempt to use keepalive while the server closes the connections early b) If the clients do not use keepalive in production, you should disable keepalives in JMeter as well
WDYT?
In a normal browser/web server situation, or by HTTP/1.1 default, keep-alive is the default and the server dictates when to close the connection, not the client. The client has no idea when (the keep-alive time in the header can give it a hint, but it's just a hint). But when the server closes a connection, e.g. after 2 seconds, which is what many web servers do, we are already in an unrealistic situation, because JMeter doesn't actively respond to the closed connection at the TCP level and keeps the TCP socket in a half-open state on both ends (server and client/JMeter). This causes obvious errors as stated in this issue. So this ticket is about 2 issues:
- JMeter (HttpClient) isn't handling the FIN/ACK immediately, causing temporary half-closed sockets (CLOSE_WAIT and FIN_WAIT_2)
- "Non HTTP response message: : failed to respond" errors when JMeter makes the next request on this socket after the (web) server has sent its FIN/ACK (disconnect) packet but before the httpclient4.validate_after_inactivity time has elapsed
because this means the server is trying to shut down the socket, but the client didn't do it yet
First, if the server was about to close the connection it should probably send Connection: close header, so the client knows the server does not want reusing the connection.
Second, I'm afraid the only way to tell if "server" closes a connection is to read/write something to the connection. That means if the server silently closes a connection (which is (un)fortunately allowed by various HTTP RFCs) the client does not get an immediate notification, thus it can't discard the connections right away.
Frankly, it is not clear what clients are supposed to do with all this. It is not clear how Java implementation handles "silent connection close"
I think this is a bad implementation in the HttpClient4 library. It should simply remove a connection once the server requests it.
Looks so, however, as the only way to detect "server-side closure" is to write data, so HttpClient4 should detect "IOException when writing headers", and retry the request. It looks like https://www.rfc-editor.org/rfc/rfc2616.html#section-8.2.4 specifies that, however, I'm not sure if that is the currently active RFC.
First of all, it is not fine for a tcp connection to be in half-closed state. The client can't send data where the server would receive it. At least not in a FIN_WAIT_2 / CLOSE_WAIT state
See https://datatracker.ietf.org/doc/html/rfc9293#name-half-closed-connections
Since the two directions of a TCP connection are closed independently, it is possible for a connection to be "half closed", i.e., closed in only one direction, and a host is permitted to continue sending data in the open direction on a half-closed connection.
Could you clarify (e.g. refer a RFC) why do you think half-closed connections are "not fine"?
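As an illustration of the RFC passage quoted above, here is a minimal sketch using plain java.net sockets on the loopback interface. The class and method names are made up for this example. It assumes the "server" side shuts down only its write direction with shutdownOutput(); a server that fully close()s the socket, which may be what happens in the nginx case discussed in this thread, is a different situation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class HalfCloseDemo {
    /** Returns what the "server" side read after it had already sent its FIN. */
    static String sendAfterPeerHalfClose(String message) {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket server = listener.accept()) {
            // Server closes only its sending direction (FIN) and keeps reading.
            server.shutdownOutput();
            // Client observes end of stream on its read side.
            if (client.getInputStream().read() != -1) {
                throw new IllegalStateException("expected end of stream");
            }
            // Client can still send in the direction that remains open.
            client.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
            client.shutdownOutput();
            return new String(server.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAfterPeerHalfClose("ping")); // prints "ping"
    }
}
```

The sketch only shows that TCP permits sending on a half-closed connection. Whether an HTTP server still reads and answers a request after sending its FIN is up to the server.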
In my case, 1 or 2 out of approximately 8000 to 10000 transactions fail with "Non HTTP response message: connection reset by peer" or a connection timeout. I think JMeter is closing the connection before it receives the complete response from the server, and I can see it producing code 499 in the server log. I tried setting higher connection timeout and response timeout values, but no luck.
because this means the server is trying to shut down the socket, but the client didn't do it yet
First, if the server was about to close the connection it should probably send Connection: close header, so the client knows the server does not want reusing the connection.
No, with keep-alive (HTTP/1.1 standards), it can disconnect whenever it wants. In many cases that can be after just 2 seconds. The Connection: close header is only a hint from the client asking the server to close the connection immediately after the response. If the server answers with Connection: keep-alive, that certainly doesn't mean it will keep the connection open indefinitely; it will disconnect at some point (anywhere from near-instantly to 1, 2, or however many seconds). When the server disconnects earlier than the httpclient4.validate_after_inactivity setting, JMeter gets into the problem zone: if the thread makes a request AFTER the server sent the disconnect but before the validate_after_inactivity value (default 4900 ms) has elapsed, you get the non-HTTP error, because the request is sent over a half-closed socket and the server responds with a RST, since JMeter/HttpClient had already answered the FIN packet with an ACK.
Second, I'm afraid the only way to tell if "server" closes a connection is to read/write something to the connection. That means if the server silently closes a connection (which is (un)fortunately allowed by various HTTP RFCs) the client does not get an immediate notification, thus it can't discard the connections right away.
Sending a FIN/ACK is, I think, not a silently closed connection but a clear message to the client that the server is actively closing the connection. In fact, the client gets the packet/message, so there is nothing silent about it.
Frankly, it is not clear what clients are supposed to do with all this. It is not clear how Java implementation handles "silent connection close"
It is clear: it should reconnect if it wants to make a new request. The problem is in httpclient4, which is not actively closing the socket or responding to the FIN/ACK. It knows the FIN/ACK is there, because if you make a request AFTER the httpclient4.validate_after_inactivity time, it does know the socket was requested to be closed and acts as it is supposed to. It should have done so immediately when it got the FIN/ACK.
I think this is a bad implementation in the HttpClient4 library. It should simply remove a connection once the server requests it.
Looks so, however, as the only way to detect "server-side closure" is to write data, so HttpClient4 should detect "IOException when writing headers", and retry the request. It looks like https://www.rfc-editor.org/rfc/rfc2616.html#section-8.2.4 specifies that, however, I'm not sure if that is the currently active RFC.
This is the whole idea of this ticket, it doesn't do so and jmeter throws this non-http error (because httpclient sends data on the socket in half-state and receives a RST from the server).
https://datatracker.ietf.org/doc/html/rfc9293#name-closing-a-connection
In the end, we can look into standards, but if it throws errors where it is not supposed to, and if it does not act as browsers and other clients do, there is a bug.
@jgaalen , please refer to RFCs or the public documentation. Otherwise it is hard to tell where all your conclusions come from. Many parts of your messages violate or contradict RFCs and Java documentation.
the problem is in httpclient4, where it is not actively closing the socket/responding to the FIN/ACK. It knows it is there
Please double-check. The Java side does not know there's FIN. "FIN" is not exposed in Java APIs. If you know an API, please clarify which one exposes "FIN from the server".
Then, the application (e.g. Java application) can't tell if the server "died completely", "fully closed the stream" or "closed the write part of the stream only". See https://github.com/golang/go/issues/67337#issuecomment-2123284523
@jgaalen , please refer to RFCs or the public documentation. Otherwise it is hard to tell where all your conclusions come from. Many parts of your messages violate or contradict RFCs and Java documentation.
Perhaps screenshots make it more obvious what is going on. I've made a test case, with a simple nginx and a keep-alive timeout setting of 2 seconds. Then running a simple jmeter thread with 1 request, then waits 4 seconds, then does request 2. We get the following error:
This is a tcpdump which shows what is going on:
Here we can see it makes the request. Then, 2 seconds later, the web server sends the FIN to close the socket. JMeter (the client) only responds with an ACK. At this moment, the connection is in CLOSE_WAIT state at the client (JMeter) and in FIN_WAIT_2 state at the server (nginx). After this, it does nothing. 2 seconds later (4 seconds after the previous request), we can see the client (JMeter) try to send the next request on the existing socket. The server (nginx) doesn't respond with an HTTP 200 or any other valid HTTP response, but sends the FIN again (I said RST before; that was wrong, it sends the FIN). After this, JMeter throws the Non HTTP response code error.
You cannot argue this is good and expected behaviour from JMeter. It should simply not try to send the request on this half-closed socket. Even though it is technically allowed to keep the socket in a half-closed/open state, it should not send a new request over it, as that leads to obvious errors which are not realistic.
This screenshot shows the behaviour of a browser (Firefox). You can see that 2 s after the response, the server closes the connection, and Firefox (the client) immediately closes the connection on its side as well, clearing the socket on both ends.
This screenshot shows the behaviour when we wait 6 seconds rather than 4 (passing the 4900 ms default of httpclient4.validate_after_inactivity), where we see better behaviour. Before it tries to send the request, it first closes the connection on the client side as well (notifying the server with a FIN/ACK) and creates a new connection.
A workaround is to set httpclient4.validate_after_inactivity to perhaps 1 ms, so it always checks whether the socket is in a half-closed state and acts as it should.
the problem is in httpclient4, where it is not actively closing the socket/responding to the FIN/ACK. It knows it is there
Please double-check. The Java side does not know there's FIN. "FIN" is not exposed in Java APIs. If you know an API, please clarify which one exposes "FIN from the server".
Then, the application (e.g. Java application) can't tell if the server "died completely", "fully closed the stream" or "closed the write part of the stream only". See golang/go#67337 (comment)
If it doesn't know it received the FIN, how is it possible it sends the FIN back AFTER the httpclient4.validate_after_inactivity timeout? This is not the OS, it is the client (jmeter/java/httpclient) that does this. Shown in this screenshot: https://private-user-images.githubusercontent.com/10229665/407698447-4d0c1265-fcb9-40a3-ba8b-603a31350445.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzgxNDc5MTQsIm5iZiI6MTczODE0NzYxNCwicGF0aCI6Ii8xMDIyOTY2NS80MDc2OTg0NDctNGQwYzEyNjUtZmNiOS00MGEzLWJhOGItNjAzYTMxMzUwNDQ1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAxMjklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMTI5VDEwNDY1NFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTUzYWNjNDU4OWY1ZWZiYWMzZmJjNzY5ZmU2MWU5MGYxNjQ1MTU5YTg1ODAzZjVkNWZiNmUyNmI0ODViYzA2MjImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.dLouViVu29aM-ltSrKNO6xsqdMD6cXNCgiERjI5TyAk
If it doesn't know it received the FIN, how is it possible it sends the FIN back AFTER the httpclient4.validate_after_inactivity timeout?
HttpClient validates the connection before making a request if httpclient4.validate_after_inactivity has expired. What it does is attempt a read() with a 1 ms timeout. It closes the client's connection if it receives EOF. Connection close from the client's perspective is never automatic, nor does the client receive a notification on the server's close.
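The check described above can be sketched roughly as follows with a plain java.net.Socket. This is an illustration only, not HttpClient's code: the names are hypothetical, the real implementation reads into its session buffer so no data is lost, and the demo uses a 100 ms timeout instead of 1 ms to stay robust on slow machines.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class StaleCheckSketch {
    /** Returns true if a short read hits end of stream, i.e. the peer has closed its side. */
    static boolean isStale(Socket socket, int timeoutMillis) {
        try {
            int previousTimeout = socket.getSoTimeout();
            socket.setSoTimeout(timeoutMillis);
            try {
                // Caveat: this simple version would swallow one byte if data were pending.
                return socket.getInputStream().read() == -1;
            } catch (SocketTimeoutException e) {
                return false; // nothing arrived in time: connection presumed usable
            } finally {
                socket.setSoTimeout(previousTimeout);
            }
        } catch (IOException e) {
            return true; // any other I/O failure: treat the connection as unusable
        }
    }

    /** Checks a loopback connection before and after the "server" side closes it. */
    static boolean[] demo() {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket server = listener.accept()) {
            boolean beforeClose = isStale(client, 100);
            server.close(); // server closes the idle keep-alive connection
            boolean afterClose = isStale(client, 100);
            return new boolean[] {beforeClose, afterClose};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("stale before close: " + result[0] + ", after close: " + result[1]);
    }
}
```

The point of the sketch is that the application only learns about the server's close when it actually performs a read, which matches the explanation that the close is noticed at validation time rather than when the FIN arrives.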
"What it does it attempts a read() with 1ms timeout." Where specifically does it read? It doesn't send a packet to the peer, because what we can see from the tcpdump is that it sends a FIN itself. So it sends a FIN after the validate_after_inactivity period and naively sends the request data before it.
Where does it reads specifically?
https://github.com/apache/httpcomponents-core/blob/a5c117028b7c620974304636d52f06f172f1d08b/httpcore/src/main/java/org/apache/http/impl/pool/BasicConnPool.java#L92
=>
https://github.com/apache/httpcomponents-core/blob/a5c117028b7c620974304636d52f06f172f1d08b/httpcore/src/main/java-deprecated/org/apache/http/impl/AbstractHttpClientConnection.java#L323
=>
https://github.com/apache/httpcomponents-core/blob/a5c117028b7c620974304636d52f06f172f1d08b/httpcore/src/main/java-deprecated/org/apache/http/impl/io/SocketInputBuffer.java#L88-L99
So setting the validation setting to 1 ms would at least solve the unexpected non-HTTP errors (but not the high number of CLOSE_WAIT/FIN_WAIT_2 states).
so setting validity setting to 1ms would at least solve the unexpected non-http errors
It might add an artificial 1+ms on every request though.
So basically this already happens for every request happening after 4900ms of idle time on a socket?
I have to say, @vlsi, that I find this very demotivating, and I feel you're not taking this issue seriously and are deflecting everything. On other topics too, I've made some contributions to this open source project, but everything ends up in the void. What is the status of JMeter anyway? The last update was more than a year ago! That isn't a good sign for a tool as widely used as JMeter, so what is that about? If you want to keep JMeter alive and the community contributing, then this is, I think, not the way to go.