akka-http
WebSocket stream not closed when there is a network disconnection (and while using keep-alive)
Hello everyone,
We are experiencing an issue with akka-http 10.1.5 that doesn't terminate the websocket stream when there is a network disconnection.
Our server configuration is the following:
akka {
  http {
    server {
      idle-timeout = 60 seconds
      websocket {
        periodic-keep-alive-mode = ping
        periodic-keep-alive-max-idle = 30 seconds
      }
    }
    client {
      idle-timeout = infinite
    }
  }
}
The ping/pong mechanism seems to work as expected (and as documented): in Wireshark I see the data being sent from the server every 30 seconds and the corresponding ACK coming back from the client.
If I stop the client manually, the server (deployed on AWS) recognizes that there is a disconnection, and the websocket flow terminates from the server side.
If I drop the WIFI connection (where the client resides, while the server is deployed on AWS), the server does not receive any termination (or at least it's not propagated?).
Did I misunderstand how the keep-alive mechanism is supposed to work? I am fairly sure that some weeks or months ago, with akka-http 10.1.1, this mechanism worked and the automatic termination happened, but I am not 100% sure; I just remember having tried it.
Unfortunately I can't share any code, but I could set up a small PoC if needed.
Thanks in advance.
> If I drop the WIFI connection (where the client resides, while the server is deployed on AWS), the server does not receive any termination (or at least it's not propagated?).
This is basically the worst case for an idle TCP connection. All of the IP-level routing still works, but one of the endpoints has gone away. This can only be detected once one of the sides sends another TCP packet, which will eventually run into a timeout.
It seems we only implemented the sending of PINGs but do not react when there are no PONGs coming back.
Also we should add ping/pong support to the client.
Thanks for answering.
> Also we should add ping/pong support to the client.
I think that this is already included (see: https://doc.akka.io/docs/akka-http/current/configuration.html).
The issue is really when the connection drops suddenly without coming back and without giving the client the possibility to terminate the stream.
> I think that this is already included (see: doc.akka.io/docs/akka-http/current/configuration.html).
Right, sorry, I missed where it was added.
In the meantime, I would like to add that, to partially solve the issue, one idea would be to move the ping/pong to the client side, so that the server is at least able to clean up its resources (it automatically terminates the connection after akka.http.server.idle-timeout seconds).
The issue is that the client still considers itself connected once the WIFI drops. It would be nice to have a sort of read-timeout and write-timeout for websockets. That would really help here: instead of checking that a pong is received for each ping (by introducing state here and there), you would simply see that no data has been received for X seconds (and this would be the time frame for receiving the pong back). This is similar to what traefik does.
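The read-timeout idea above can be illustrated with a minimal sketch. This is not an akka-http API: `IdleWatchdog`, `onReceived` and `isExpired` are hypothetical names, and the sketch assumes the application can observe some inbound traffic (for example application-level heartbeat messages) and calls `onReceived` for each message. The clock is injected so the logic can be checked without sleeping.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical helper: treats a connection as dead when nothing was received for maxIdle. */
final class IdleWatchdog {
    private final long maxIdleNanos;
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private long lastReceivedNanos;

    IdleWatchdog(Duration maxIdle, LongSupplier nanoClock) {
        this.maxIdleNanos = maxIdle.toNanos();
        this.nanoClock = nanoClock;
        this.lastReceivedNanos = nanoClock.getAsLong();
    }

    /** Call for every inbound message the application can observe. */
    synchronized void onReceived() {
        lastReceivedNanos = nanoClock.getAsLong();
    }

    /** True once no inbound traffic has been seen for longer than maxIdle. */
    synchronized boolean isExpired() {
        return nanoClock.getAsLong() - lastReceivedNanos > maxIdleNanos;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L}; // simulated time in nanoseconds
        IdleWatchdog watchdog = new IdleWatchdog(Duration.ofSeconds(45), () -> fakeNow[0]);

        fakeNow[0] = Duration.ofSeconds(30).toNanos();
        System.out.println(watchdog.isExpired()); // false: only 30s of silence

        fakeNow[0] = Duration.ofSeconds(46).toNanos();
        System.out.println(watchdog.isExpired()); // true: 46s without any inbound message
    }
}
```

In a real client, a periodic task would poll `isExpired()` and fail the stream when it returns true; no per-ping state is needed, which is the point of the read-timeout approach.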
So, after digging even deeper into this issue, another workaround (from the client perspective) might be to tweak the kernel settings net.ipv4.tcp_retries1 and net.ipv4.tcp_retries2. I didn't try it, because we have another application that controls the connection/disconnection; I just thought it would be nice to document that there might be alternatives.
This way, the client can drop off the network and leave the TCP connection open for up to X seconds (configurable, therefore predictable), leaving it to the OS to terminate the connection. Also, if you use Docker containers, please note that these are kernel settings and might affect other connections on that host.
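To give a feeling for how "X seconds" relates to net.ipv4.tcp_retries2, here is a rough sketch of the arithmetic. It assumes Linux's usual bounds for the retransmission timeout (200 ms minimum, 120 s maximum) and exponential backoff starting from the minimum; the real timeout depends on the measured round-trip time, so the result is only a lower bound. The Linux kernel documentation quotes 924.6 seconds for the default value of 15, which this calculation reproduces. The class and method names are made up for illustration.

```java
/** Hypothetical calculator: lower bound of the TCP retransmission timeout on Linux. */
final class TcpRetransmitEstimate {
    static final long RTO_MIN_MILLIS = 200;     // assumed Linux minimum RTO
    static final long RTO_MAX_MILLIS = 120_000; // assumed Linux maximum RTO

    /** Sums the backoff intervals for the initial transmission plus retries2 retransmissions. */
    static long hypotheticalTimeoutMillis(int retries2) {
        if (retries2 < 0) {
            throw new IllegalArgumentException("retries2 must be >= 0");
        }
        long total = 0;
        long rto = RTO_MIN_MILLIS;
        for (int i = 0; i <= retries2; i++) {
            total += rto;
            rto = Math.min(rto * 2, RTO_MAX_MILLIS); // exponential backoff, capped
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(hypotheticalTimeoutMillis(15)); // 924600 (about 15.4 minutes)
        System.out.println(hypotheticalTimeoutMillis(5));  // 12600 (12.6 seconds)
    }
}
```

So lowering tcp_retries2 from 15 to, say, 5 would bring the worst-case detection time from roughly a quarter of an hour down to the order of ten seconds, at the cost of dropping connections more eagerly on a lossy network.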
In general, it seems that supporting TCP_USER_TIMEOUT in the socket options would be enough to solve this sort of issue. Is there any other way to add it? As far as I know, akka-http allows setting some properties like:
socket-options {
  ...
  tcp-keep-alive = undefined
  tcp-oob-inline = undefined
  tcp-no-delay = undefined
}
It would be nice to support something like TCP_USER_TIMEOUT (although it's not portable; at least, it doesn't seem to exist on Windows).
Do you think it's a good idea to do so?
> it would be nice to support something like TCP_USER_TIMEOUT (although it's not portable, at least it doesn't seem to exist on windows).
It seems TCP_USER_TIMEOUT is currently not supported by Java.
> Do you think it's a good idea to do so?
I think some of these could be an option for some people. However, all of these depend on the environment. We cannot make too many assumptions about the environment, so I'd rather do whatever we can to detect these situations in our code (i.e. reacting to missing PONGs in our own code).
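As a rough illustration of what "reacting to missing PONGs" could mean, here is a small state-tracking sketch. It is not how akka-http implements or plans to implement it; `PongTimeoutTracker` and its methods are hypothetical names, and the sketch assumes the caller is notified of both outgoing pings and incoming pongs (inside the library, or via application-level ping/pong messages). It remembers the oldest unanswered ping and reports a timeout if no matching or later pong arrives in time.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical helper: detects a dead peer by tracking the oldest unanswered ping. */
final class PongTimeoutTracker {
    private final long timeoutNanos;
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private long nextPayload = 0;
    private boolean awaitingPong = false;
    private long oldestUnansweredPayload;
    private long oldestUnansweredSentNanos;

    PongTimeoutTracker(Duration timeout, LongSupplier nanoClock) {
        this.timeoutNanos = timeout.toNanos();
        this.nanoClock = nanoClock;
    }

    /** Records an outgoing ping and returns the payload to put into it. */
    synchronized long onPingSent() {
        long payload = nextPayload++;
        if (!awaitingPong) { // later pings must not reset the deadline of an earlier one
            awaitingPong = true;
            oldestUnansweredPayload = payload;
            oldestUnansweredSentNanos = nanoClock.getAsLong();
        }
        return payload;
    }

    /** A pong for the oldest pending ping, or any later one, proves the peer is alive. */
    synchronized void onPong(long payload) {
        if (awaitingPong && payload >= oldestUnansweredPayload) {
            awaitingPong = false;
        }
    }

    /** True if a ping has been waiting for its pong longer than the timeout. */
    synchronized boolean isTimedOut() {
        return awaitingPong && nanoClock.getAsLong() - oldestUnansweredSentNanos > timeoutNanos;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L}; // simulated time in nanoseconds
        PongTimeoutTracker tracker = new PongTimeoutTracker(Duration.ofSeconds(10), () -> fakeNow[0]);

        tracker.onPingSent();
        fakeNow[0] = Duration.ofSeconds(11).toNanos();
        System.out.println(tracker.isTimedOut()); // true: no pong within 10s
    }
}
```

This is exactly the extra state the read-timeout proposal earlier in the thread tries to avoid, but it works without any OS-level tuning.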
+1 on this. I have an Akka WebSocket client connecting to the BitMex WS, which has ping/pong support, but there is no reaction if I drop the internet connection on the client machine, even though no PONG responses arrive from BitMex.
> I think some of these could be an option for some people. However, all of these depend on the environment. We cannot make too many assumptions about the environment, so I'd rather do whatever we can to detect these situations in our code (i.e. reacting on missing PONGs in our own code).
I saw that the native PONG doesn't arrive as a message in my WebSocket client flow. Is it possible to catch them some other way?
Hello @jrudolph is there a plan for fixing / implementing the pong timeout handling?
It seems no one is actively working on this right now but we are happy to review contributions.