[Documentation improvements] Heartbeats stop working when the tab goes inactive on Chrome 88+

Open anthonyraymond opened this issue 4 years ago • 38 comments

Hello,

I've spent an hour today trying to debug a non-existent bug between your lib and a STOMP server. To save others from doing the same, here is what is going on.

Problem

I have a (15,15) heartbeat set in my app. When my tab stays inactive for some time (roughly 5 minutes), the STOMP connection is eventually closed for no apparent reason.

Reason

Heartbeats stop working after some time. In Chrome (and all Chromium forks), when a tab meets certain criteria it is considered inactive and all its timers are throttled.

The throttling used to be one action per second, but since Chrome 88 a stricter set of criteria also applies, and when they match the throttle becomes one action per minute.

It's not something that can be fixed IMO, but it would be handy to have this warning in the project documentation.

anthonyraymond avatar Mar 21 '21 22:03 anthonyraymond

This indeed is an issue. Thanks for your detailed notes including links to appropriate documentation.

I agree that it is not fixable by this library. It is understandable why browsers are taking this approach, can't blame them :smile:

However, it will be useful if we can find settings that work. I have the following thinking:

  • The outgoing ping uses and relies on a block linked to the timer to be executed.
  • Incoming ping, however, works differently. It uses a timer block to check if there was no ping received within a window. If the ping from the server is actually received within the window the block linked to the timer need not be executed.
  • So, we might have a solution - if we disable outgoing ping (by setting the period to 0).

You should try and see if it makes a difference. Meanwhile, I will also test and see if that works.
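
A minimal sketch of the "server-only heartbeat" idea above. It assumes the `heartbeatIncoming` and `heartbeatOutgoing` option names used by the @stomp/stompjs `Client` configuration; the `HeartbeatConfig` interface and the `serverOnlyHeartbeat` helper are hypothetical names introduced only for illustration, and whether a given broker tolerates a silent client is not established here.

```typescript
// Local stand-in for the two heartbeat fields of the client configuration.
// Field names follow @stomp/stompjs; the interface itself is hypothetical.
interface HeartbeatConfig {
  heartbeatIncoming: number; // ms between expected server heartbeats, 0 disables
  heartbeatOutgoing: number; // ms between client heartbeats, 0 disables
}

// Hypothetical helper: keep the server-to-client heartbeat, drop the
// client-to-server one so that no outgoing timer has to fire in an idle tab.
function serverOnlyHeartbeat(incomingMs: number): HeartbeatConfig {
  if (!Number.isFinite(incomingMs) || incomingMs < 0) {
    throw new RangeError("incomingMs must be a non-negative number");
  }
  return { heartbeatIncoming: incomingMs, heartbeatOutgoing: 0 };
}

const config = serverOnlyHeartbeat(15000);
console.log(config);
```

The resulting object could be spread into the options passed to the client, next to the broker URL and the other settings.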

kum-deepak avatar Mar 22 '21 12:03 kum-deepak

Disabling outgoing ping might indeed work, relying on the server sending the heartbeat should be enough to prevent the websocket being closed.

There might be another solution (which is ugly):

  • Since the receiving timer keeps working, we could send a PING as soon as we receive a PONG from the server. Most of the time incoming and outgoing use the same heartbeat rate, so this would usually solve the problem, but it makes the API highly non-intuitive IMO...

Having a 0 outgoing ping looks like a much better idea to me, but it should be specified in the documentation.

anthonyraymond avatar Mar 22 '21 13:03 anthonyraymond

We are facing the same problem even when we set the outgoing heartbeat to 0. We are experiencing WebSocket disconnects roughly every minute. Do you have any workaround suggestions?

alana1 avatar Apr 28 '21 18:04 alana1

Your problem and the one described in this issue are two different topics: you get random disconnections every minute or so, while this issue is about Chrome throttling timeouts and intervals when the tab goes inactive.

It would be hard to help you without any background. Are you using this lib as a client or a server? Which STOMP client / server do you use, which parameters are you using, and is there a reverse proxy in between?

anthonyraymond avatar Apr 28 '21 19:04 anthonyraymond

Thank you for your response. We are using the stomp lib as a client. The root issue is actually the one reported in this topic. As reported here, when the tab is inactive for ~5 minutes, the STOMP connection is closed. Following your suggestion of March 22, we set the outgoing heartbeat to 0, and we now get disconnects roughly every minute. However, when we set it back to 10 seconds (outgoing and incoming), the STOMP connection is closed when the tab is inactive (as reported here). I hope that makes sense.

alana1 avatar Apr 28 '21 21:04 alana1

I've not experimented with outgoing heartbeat = 0 myself. I'm a firm believer in heartbeats: they solve problems at multiple levels (a reverse proxy closing an inactive WS connection on its own, dead connection detection, and so on).

That said, the 1 minute timeout seems weird. Do you know which side is closing the connection? Is it the server or the client?

anthonyraymond avatar Apr 28 '21 21:04 anthonyraymond

Recent versions of RabbitMQ seem to have a bug. Which broker are you using?

kum-deepak avatar Apr 29 '21 01:04 kum-deepak

Client side is closing the connection.

alana1 avatar Apr 29 '21 16:04 alana1

We tested it with version 3.7.7

alana1 avatar Apr 29 '21 17:04 alana1

In my tests it works for 3.6.x, fails for 3.7.x and 3.8.x.

I raised it in the RabbitMQ user group, please see https://groups.google.com/g/rabbitmq-users/c/HKDpmrZpxkU/m/FdyZb3HoBAAJ for details.

kum-deepak avatar Apr 29 '21 17:04 kum-deepak

Thank you for sharing the information. I did notice that there is a RabbitMQ heartbeat configuration that defaults to 60s. This configuration appears to apply only to connections using the AMQP protocol. I do see heartbeat = 60 in the connection information of the RMQ management UI. However, I don't see a heartbeat for connections using the WebStomp protocol.

alana1 avatar Apr 29 '21 18:04 alana1

For info, I have managed to prevent disconnection by setting heartbeats to 60s, 60s (the minimum interval at which Chrome will serve setInterval when the tab is idle), but I had to increase the default WebSocket timeout on RMQ to 120s (default is 60s) to prevent closure from the server side: web_stomp.ws_opts.idle_timeout = 120000
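
The two constraints in this workaround can be written down as a small check. This is only a sketch under the assumptions stated in this thread (a one-minute timer period for idle tabs on Chrome 88+, and a broker-side idle timeout such as `web_stomp.ws_opts.idle_timeout`); `HeartbeatSettings`, `THROTTLED_TIMER_PERIOD_MS`, and `checkSettings` are hypothetical names, not part of any library.

```typescript
// Hypothetical settings bundle; only the heartbeat field names follow @stomp/stompjs.
interface HeartbeatSettings {
  heartbeatIncoming: number;   // ms
  heartbeatOutgoing: number;   // ms, 0 disables client heartbeats
  brokerIdleTimeoutMs: number; // e.g. web_stomp.ws_opts.idle_timeout on RabbitMQ
}

// Assumption taken from this thread: timers in an idle tab may run once per minute.
const THROTTLED_TIMER_PERIOD_MS = 60000;

// Returns human-readable warnings; an empty array means both constraints hold.
function checkSettings(s: HeartbeatSettings): string[] {
  const warnings: string[] = [];
  if (s.heartbeatOutgoing > 0 && s.heartbeatOutgoing < THROTTLED_TIMER_PERIOD_MS) {
    warnings.push("outgoing heartbeat is shorter than the throttled timer period");
  }
  if (s.heartbeatOutgoing > 0 && s.brokerIdleTimeoutMs <= s.heartbeatOutgoing) {
    warnings.push("broker idle timeout does not exceed the outgoing heartbeat");
  }
  return warnings;
}

const reported: HeartbeatSettings = {
  heartbeatIncoming: 60000,
  heartbeatOutgoing: 60000,
  brokerIdleTimeoutMs: 120000,
};
console.log(checkSettings(reported));
```

With the values reported above, neither warning is raised; with the default 60s idle timeout, the second one is.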

rad-pat avatar May 10 '21 14:05 rad-pat

Many thanks!

This indeed is interesting. I was not aware of web_stomp.ws_opts.idle_timeout option. As per the protocol, heartbeats are optional. So, setting this timeout to a very large value (say a day), should allow a connection to survive without heartbeats.

The heartbeats may still be desirable to detect a stale connection. It will be worth testing server-initiated heartbeats only, along with a high value of web_stomp.ws_opts.idle_timeout. To be explicit: no client-side heartbeats that depend on timers.

Based on the findings I will update the documents and add an FAQ entry.

kum-deepak avatar May 10 '21 15:05 kum-deepak

Hello everyone :) is this issue still being worked on?

I still have this issue with the same setup as described above: RabbitMQ 3.7.x using the Stomp plugin, SpringBoot and Angular frontend. Only tabs that are not focused are affected.

Reconfiguring RabbitMQ did nothing, since the connection is restarted by the client. Server-side heartbeats are received just fine. I am actually not even certain that the heartbeats are the reason for the reconnect, since the reconnect always happens in 1 minute cycles, even when the heartbeat interval is set to several minutes.

MissingConnections

GJohannes avatar Jul 12 '21 15:07 GJohannes

Hello @GJohannes, this is not a bug that can be fixed. It's intended Chrome behaviour, and we can't fight against the browser in this case.

anthonyraymond avatar Jul 12 '21 19:07 anthonyraymond

@anthonyraymond Thank you for your fast response. Kind of suspected it but thank you for the clarification.

GJohannes avatar Jul 13 '21 07:07 GJohannes

Hello, thank you for your solution. It worked for me too. But setting the heartbeat to zero also solved my problem, which raises a question: will setting the heartbeat to zero cause me a problem? I tried many cases and it didn't give me any problems. For example, when the server is shut down because of a problem, a disconnect message appears on the client. So what does this heartbeat do? What will I lose when I set the heartbeat to zero? I would be very happy if you could help, thank you.

@rad-pat @kum-deepak

omercelikceng avatar Jul 30 '21 21:07 omercelikceng

Not sure if it helps anyone, but at Volvo we solved this by adding listeners to document.hidden and manually severing the connection when it becomes true. We then synch any missing data when visibility is restored, and establish a new connection.
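
A rough sketch of this approach, not the actual Volvo code. It assumes a client object exposing `activate()` and `deactivate()` (the method names used by the @stomp/stompjs `Client`) and the browser's `visibilitychange` event; `ConnectionLike`, `VisibilitySource`, `bindConnectionToVisibility`, and the `resync` callback are hypothetical names, and calling `resync` right after `activate()` is an assumption about ordering.

```typescript
// Minimal shapes so the sketch does not depend on the DOM or on a STOMP library.
interface ConnectionLike {
  activate(): void;
  deactivate(): unknown;
}

interface VisibilitySource {
  hidden: boolean;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Drops the connection while the page is hidden and restores it (then asks the
// application to fetch whatever it missed) once the page is visible again.
// Returns a function that removes the listener.
function bindConnectionToVisibility(
  doc: VisibilitySource,
  conn: ConnectionLike,
  resync: () => void,
): () => void {
  const onChange = (): void => {
    if (doc.hidden) {
      conn.deactivate();
    } else {
      conn.activate();
      resync();
    }
  };
  doc.addEventListener("visibilitychange", onChange);
  return () => doc.removeEventListener("visibilitychange", onChange);
}
```

In a browser, the global `document` would be passed as the first argument. A real application would probably trigger the resync from the client's connect callback rather than immediately.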

csvan avatar Jul 31 '21 15:07 csvan

Ok, I'll try that too. Thanks. I solved my problem by setting the heartbeat to zero. But what I'm wondering is: if I set the heartbeat to zero, will there be a problem? I couldn't find any. E.g., when the server is shut down because of a problem, a disconnect message appears on the client. Then why is there a heartbeat? Can you please explain?

omercelikceng avatar Jul 31 '21 16:07 omercelikceng

Never tried that, but I believe the biggest potential problem is that you won't know there is a connection issue until you actually try to send/consume a message and everything crashes.

csvan avatar Jul 31 '21 17:07 csvan

I'm so sorry, I don't know as much as you, and I don't notice potential future problems. Simply setting the heartbeat to zero fixed my problem and didn't cause any other problems, so I began to question why the heartbeat was needed. For example, although I did not produce or consume any data, I was informed (disconnect message) that the server had crashed. It could be RabbitMQ that does this. I just wanted to understand by asking. Thank you very much for helping.

omercelikceng avatar Jul 31 '21 17:07 omercelikceng

The heartbeat is used to let the server and the client know that there is still someone at the other end of the pipe. If you introduce a reverse proxy in the middle (nginx, traefik, ...), you will start to encounter problems. Apart from acting as a reverse proxy, it also introduces some neat network optimisations out of the box, and most of the time these include a default "close the TCP connection if nothing goes through". This is the main problem you will encounter IMO (and you won't know that until it goes to production 😄)

anthonyraymond avatar Jul 31 '21 23:07 anthonyraymond

I also use traefik. And I didn't know it worked like that. I understood properly now. Thank you so much.

omercelikceng avatar Aug 01 '21 07:08 omercelikceng

Some additional information on Heartbeats - when needed / not needed.

In the underlying TCP protocol, connections may survive any length of time even when no data is exchanged. If the client or the server disconnects, the other side is notified.

Let us consider two machines (say, servers) that are always on. They can hold the connection without needing a heartbeat. If either the client or the server process terminates (gracefully or ungracefully), the other side will be notified. If the client or server machine reboots gracefully, the other side will be notified. If either OS crashes, the other side will not be notified. If either machine loses its network connection, the other side will not be notified.

There is an interesting case here in favour of not having heartbeats. Consider that the network connection between the machines breaks and then recovers. As long as no communication attempt happens in that period, the connection will survive.

Now, let us consider an additional scenario: the client machine is a laptop. The user may simply close the lid, in which case the server will not be notified. To complicate it further, the laptop can be opened and then connect to a different WiFi network. In that case the client will get an error when it tries to communicate. The server may not even realize that the client is no longer connected.

Based on your particular situation, you may decide the frequencies. In some cases, I use intervals as long as 5 minutes. From server to server, I sometimes go without any heartbeats. In usual web applications, 10 seconds to 120 seconds.

kum-deepak avatar Aug 01 '21 07:08 kum-deepak

You explained it really well and very clearly. Thank you so much. I understood all the problems with the network. Again, I have a question due to my lack of knowledge. How can it guarantee that it will send a disconnect message when the server is shutting down? I'm not saying the operating system crashes. Let's say my service crashed due to out of memory. Can we definitely say that we can send a disconnect message? Thank you very much again, I understand very well.

omercelikceng avatar Aug 01 '21 08:08 omercelikceng

On behalf of a process, TCP connections are managed by the OS. When a process terminates (gracefully or ungracefully), the OS knows and it will close all the open connections. This implies the client will know that the TCP connection is broken, and a WebSocket close event will occur. This will not be a graceful STOMP shutdown. This library, on WebSocket close, will schedule a reconnect. Depending on the actual circumstances, a WebSocket error event may be raised as well.
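
To see what the close event reports when this happens, something like the following sketch may help. It assumes the `reconnectDelay` option and the `onWebSocketClose` callback of the @stomp/stompjs client configuration; `CloseEventLike`, `ReconnectConfig`, and `reconnectConfig` are hypothetical names introduced here, and the log format is arbitrary.

```typescript
// The subset of the browser CloseEvent this sketch reads.
interface CloseEventLike {
  code: number;      // 1006 means the socket closed without a close frame
  reason: string;
  wasClean: boolean;
}

// Hypothetical slice of the client configuration; option names follow @stomp/stompjs.
interface ReconnectConfig {
  reconnectDelay: number; // ms to wait before reconnecting, 0 disables reconnects
  onWebSocketClose: (evt: CloseEventLike) => void;
}

// Builds a config fragment that logs every WebSocket close before the
// library schedules its reconnect.
function reconnectConfig(delayMs: number, log: (line: string) => void): ReconnectConfig {
  return {
    reconnectDelay: delayMs,
    onWebSocketClose: (evt) =>
      log(`WebSocket closed: code=${evt.code} clean=${evt.wasClean} reason=${evt.reason || "(none)"}`),
  };
}

const fragment = reconnectConfig(5000, (line) => console.log(line));
console.log(fragment.reconnectDelay);
```

The close code and the `wasClean` flag give a hint about whether the connection ended with a proper close handshake or simply dropped.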

kum-deepak avatar Aug 01 '21 08:08 kum-deepak

You are really awesome. And you explain it very well. I hope one day I can be as knowledgeable as you.. Thanks.

omercelikceng avatar Aug 01 '21 08:08 omercelikceng

Thank you @anthonyraymond for creating this issue and documenting your findings. We had the same disconnects for inactive tabs on chrome 88+ with stomp-js and Spring Boot 2.x websockets. But as mentioned here in the comments, a heartbeat configuration with 60s/60s prevents the client from being disconnected.

planschmu avatar Aug 05 '21 12:08 planschmu

@planschmu 60/60 should do the trick. If you use a reverse proxy, you'll have to change the default config at that level too, because most of the reverse proxies I know forcefully close a TCP socket that has not been used for more than 30s.

anthonyraymond avatar Aug 05 '21 15:08 anthonyraymond

I recently encountered this problem, may I ask whether to set 60/60s on the server side or 60/60s on the client side?

CG-Lin avatar May 13 '23 09:05 CG-Lin