Resynchronizing UI by client's request
Description of the bug
I often see the message "Resynchronizing UI by client's request" in the log.
The full message is: Resynchronizing UI by client's request. The network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
It does not seem to be the connection; apparently it is session related.
On the same Linux server I also run another application, on Vaadin 8. The two have different names, different ports, live in different folders and are mapped by NGINX to different subdomains.
I have already evaluated the other problems related to the topic: #12640 There was no clear solution, someone raised some possibilities:
- Browser, it's not in my case, it wouldn't happen so often.
- HTTP proxy, I use NGINX for the subdomains, but I use other services like NODE, static HTML and never had a problem. NGINX was configured by default, without any additional configuration, it just throws the subdomain to port X.
- Possibility of mixing sessions: it makes no sense, since I have very little load, it happens even with only 1 user logged in.
#12173 In this problem the user uses long duration push. Not my case, I use simple, default @Push.
#11645 In this problem as I understand it was the slow connection. It's not my case, everything is flying here.
#10096 This is a very similar scenario, but there was no conclusion: the user closed the issue without saying whether or how it was resolved.
#9399 That user solved it by changing the server, so it is difficult to assess what the problem was.
Anyway, this problem is quite recurrent and should be better explained in the documentation. I downloaded the example available from the site, nothing out of the ordinary, little or no extra configuration.
Expected behavior
The expectation is that it doesn't lock the user's screen. It is terrible to have to ask users to refresh the page, because once it breaks it does not recover on its own.
Minimal reproducible example
It's hard to simulate, because it doesn't always happen. The impression is that it happens after a while without changes on the page, but sometimes it happens right after logging in or during some slower operation.
Versions
- Vaadin / Flow version: 23.2.0.alpha1
- Java version: openjdk version "11.0.3" 2019-04-16 OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.10.1) OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.10.1, mixed mode, sharing)
- OS version: Ubuntu
- Browser version (if applicable): Chrome
Difficult. I increased the timeout on NGINX to 600 seconds and the problem continues.
Same on Vaadin version 14.8.14
WARN com.vaadin.flow.server.communication.ServerRpcHandler [http-nio-8080-exec-3] Resynchronizing UI by client's request. Under normal operations this should not happen and may indicate a bug in Vaadin platform. If you see this message regularly please open a bug report at https://github.com/vaadin/flow/issues
tiagomartins91
Apparently no one from Vaadin is watching here. Let's try to find the problem ourselves and see what we have in common.
1 - Do you have any other Vaadin application running on the same server? A: I do, but it's in another folder, on another version, with nothing shared.
2 - Does it happen in development, when running with IntelliJ or similar? A: No.
3 - Does it happen in production? A: Yes. When I stop the service and deploy a new version, it happens on the first login most of the time, which disproves the theory that it is caused by the page sitting idle.
I was hoping it was a conflict between the two applications, but it is not. I completely stopped the other application and started the new one, on Vaadin 23.
And the same problem happened: Resynchronizing UI by client's request. The network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
This is very bad, it is a very serious problem.
I would suggest to post your nginx and push configuration.
Push default:
@Theme(value = "myapp")
@PWA(name = "upCampo", shortName = "upCampo")
@NpmPackage(value = "line-awesome", version = "1.3.0")
@Push
NGINX file:
server {
server_name novoportal.MYWEBSITE;
location / {
proxy_pass http://127.0.0.1:1628;
}
listen [::]:443 ssl; # managed by Certbot
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/novoportal.MYWEBSITE/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/novoportal.MYWEBSITE/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
if ($host = novoportal.MYWEBSITE) {
return 301 https://$host$request_uri;
} # managed by Certbot
listen 80;
listen [::]:80;
server_name novoportal.MYWEBSITE;
return 404; # managed by Certbot
proxy_read_timeout 600;
proxy_connect_timeout 600;
proxy_send_timeout 600;
}
About the timeouts in NGINX: I added more time, but it made no difference with or without this part of the configuration:
proxy_read_timeout 600;
proxy_connect_timeout 600;
proxy_send_timeout 600;
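One thing worth checking in the file posted above: the three timeout directives sit in the port-80 server block, which only redirects to HTTPS, so they would not apply to the proxied connection at all. That may be why they made no difference. A minimal sketch, assuming the same upstream at 127.0.0.1:1628, of where they would need to go to affect the traffic that reaches the application:

```nginx
server {
    server_name novoportal.MYWEBSITE;

    location / {
        proxy_pass http://127.0.0.1:1628;

        # Timeouts set here apply to the proxied connection.
        # nginx defaults each of these to 60 seconds.
        proxy_read_timeout 600;
        proxy_connect_timeout 600;
        proxy_send_timeout 600;
    }

    # The listen / ssl_* lines managed by Certbot stay as they are.
}
```

This only moves the existing values; whether 600 seconds is the right figure depends on the push timeout configured on the application side.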
I don't see anything related to push in the configuration. Cuba has an example for nginx that you could try: https://doc.cuba-platform.com/manual-latest/server_push_settings.html#server_push_settings_using_proxy - the important part is the one about upgrade.
I'm more experienced with Apache httpd, and there it is a must to configure WebSockets correctly for them to work in corporate networks.
I added your suggestion in NGINX. Good news: so far, the problem hasn't happened yet. I'll leave it running during the day and come back here to confirm if it worked or not.
Thank you very much!
Sorted out! Big help. I opened the CUBA link and used the "location" part; it seems to be related to WebSocket support. Since adding it, I haven't had any more problems. @knoobie Thank you so much for your help, it saved me several sleepless nights!
I'm sharing the NGINX configuration I'm using here for other users, and I won't close the ticket, so that someone from Vaadin can evaluate whether any improvement is needed on this topic. I believe this could be in the documentation.
Here's the complete file:
server {
server_name mywebsite.com;
location / {
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 3600;
proxy_connect_timeout 240;
proxy_set_header Host $host;
proxy_set_header X-RealIP $remote_addr;
proxy_pass http://127.0.0.1:PORT_EXIT_DO_YOUR_SPRINGBOOT;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
listen [::]:443 ssl; # managed by Certbot (Certificate SSL)
listen 443 ssl; # managed by Certbot (Certificate SSL)
ssl_certificate /etc/letsencrypt/live/mywebsite.com/fullchain.pem; # managed by Certbot (Certificate SSL)
ssl_certificate_key /etc/letsencrypt/live/mywebsite.com/privkey.pem; # managed by Certbot (Certificate SSL)
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot (Certificate SSL)
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot (Certificate SSL)
}
server {
if ($host = mywebsite.com) {
return 301 https://$host$request_uri;
} # managed by Certbot
listen 80;
listen [::]:80;
server_name mywebsite.com;
return 404; # managed by Certbot (Certificate SSL)
proxy_read_timeout 600;
proxy_connect_timeout 600;
proxy_send_timeout 600;
}
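For comparison, the WebSocket proxying example in the nginx documentation differs from the file above in two ways: it sets proxy_http_version 1.1 (nginx speaks HTTP/1.0 to upstreams by default), and it derives the Connection header from a map instead of hard-coding "upgrade". A minimal sketch of that pattern follows; the port 8080 is an assumption standing in for the placeholder in the file above, and nothing in this thread confirms that these two lines account for the remaining resyncs.

```nginx
# In the http block: send "Connection: upgrade" only when the client
# actually asked for an upgrade, otherwise "Connection: close".
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    server_name mywebsite.com;

    location / {
        proxy_pass http://127.0.0.1:8080;   # assumed application port

        # The WebSocket handshake needs HTTP/1.1 towards the upstream.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        proxy_set_header Host $host;
        proxy_read_timeout 3600;
    }
}
```

The browser's network tab shows whether the push connection is established as a WebSocket (status 101) or falls back to long polling, which is a quick way to see if the upgrade passes through the proxy.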
I believe this could be in the documentation.
How to configure a reverse proxy should be an important topic inside the docs (cc: @tarekoraby)
I have bad news. I have been monitoring since that day, and it happened again. It is less frequent than before the NGINX change, but it still happens. Are there any other possible causes?
Same here
That is terrible. We never had problems like this in the old Vaadin 8 application. Imagine the inconvenience this is causing the customer. This image makes me shiver every time I see it:
[Screenshot: Captura de Tela 2022-08-09 às 09 29 42]
Please someone help us!
Please remove the documentation label. This is not only a documentation matter; it is something very serious that needs to be investigated.
I would personally try to obtain a debug log of nginx to try to understand what's happening.
I am fully available to collect the data needed to resolve this issue.
Help me, how do I do this?
I'm not an nginx expert, but I'd check the docs for instructions on that: https://nginx.org/en/docs/debugging_log.html.
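As a rough sketch of what that page describes: debug logging is switched on with the error_log directive, and it only works if the nginx binary was built with --with-debug (the output of nginx -V lists the build arguments). The log path and the client address below are assumptions for illustration.

```nginx
# Main (top-level) context of nginx.conf. The log path is an assumption.
error_log /var/log/nginx/error.log debug;

# Alternative: keep the normal level above (e.g. "warn") and enable
# debug output only for selected client addresses.
# events {
#     debug_connection 192.168.1.1;   # example address
# }
```

Debug logs grow quickly, so limiting them to one client address is usually the easier option on a production server.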
Any update? It still happens sometimes, and the page needs to be refreshed for the application to work.
We are going to investigate it more closely in the upcoming development iteration.
I've managed to reproduce this issue almost consistently. There are a couple of interesting factors that lead into this. I have a pcap and a google chrome debug output and the code that generates the issue.
java.lang.UnsupportedOperationException: Unexpected message id from the client. Expected sync id: 9, got 10. more details logged on DEBUG level.
In connection with the above, I noticed in both the pcap and the Google Chrome developer tools network tab that the server generated a sync only for ids 8 and 10.
In the PCAP I can see that the WebSocket connection actually sends a FIN packet between the id 8 packet and the id 10 packet. Further data, however, is still being sent on the socket, which is technically fine. I am unclear why the server thinks that id 9 was generated, but in the specific instance I looked at, the FIN was sent and might be related. As a general note, other reports of this issue mention network quality, and I agree with that. We have latency of around 300 ms between server and client and fairly high packet loss (~30%). As this is TCP, however, it really shouldn't affect the order and number of packets finally received by server and client, as retransmissions should end up succeeding.
For most of my code I am already wrapping UI updates in ui.access(command). I was also using @Push(PushMode.AUTOMATIC). I am unclear if this is related.
I moved to using @Push(PushMode.MANUAL) with ui.push after the ui.access, and I have not been able to reproduce this problem since.
I have a PCAP for this. I would prefer to hand it over to someone at Vaadin directly, as it likely contains confidential data.
My suspicion therefore is that the Push automated sync messages logic has some server side bug.
This was tested on version 22.0.2
@kagian Please share it, if you wanna see steady progress in this topic.
Our multi-years Vaadin 7->14 migration work just went in production.
During the dev/test period we occasionally had this problem, mainly when working remotely on a bad network. But now that we have more users, with various network configurations, we realize that the problem is more serious than expected. We talk a lot about NGINX in this thread, but my feeling is that this problem is not reverse-proxy related. We do have an NGINX in front of the app, but we also access the Tomcat port directly, and we see the problem at the same frequency. My short-term goal is to produce a small project that reproduces the problem, for the Vaadin team. I'll probably use one of those browser extensions that simulate a bad network.
This problem is very critical, and as someone said, it wasn't observable with the old 7/8 vaadin platform
We speak a lot of NGinx in this thread, my feeling is that this problem is not reverse-proxy related.
@flefebure, @kagian, @jonasrotilli The problem can be reverse-proxy related, but there are a number of other possible causes as well, e.g. a slow VPN, flaky Wi-Fi or a cellular network.
Framework corner-case bugs are also a possibility. We introduced this fix not long ago, so I recommend using the latest Vaadin 14 or 23 version and observing whether it is more stable in your environment.
https://github.com/vaadin/flow/pull/13733
Just to clarify, the protocol in use seemed to be TCP as far as I could see. There is also no proxy or additional rewrite component in the path. So for an end-to-end TCP session there really should not be any possibility of loss without recovery, and it should not be possible to get out-of-sequence or dropped packets (unless this really is UDP, which it didn't seem to be).
It really does look like some kind of state-tracking issue on the Vaadin server side. I'll check the new version; however, the change to manual push updates seems to work well and has reduced our wish to change this again.
@tatulund we just upgraded Vaadin 14.8.4->14.8.17 and Flow 2.7.11->2.7.20. Now we wait for user feedback [fingers crossed].
To complete: It decreased a lot after I made the suggestions for changes in NGINX that I posted there at the beginning. But it didn't completely solve it. It started to happen more often. No changes to the version of either Vaadin or any changes to the server in question. I believe this needs further investigation.
Any update about this?
It happens more often with the last release. It's impossible to deliver or update a vaadin application in production with this issue.
I have another application running Vaadin 8 in production and this doesn't happen there. It uses the same NGINX reverse proxy and the same configuration, without any problem.
I have the same issue and my customers are complaining. This occurs since I've upgraded from Vaadin 8 to Vaadin 23 (never happened on 8). I'm now running V23.2.2 and the problem is still there.
The app is running on Tomcat 9 in Azure with an Azure AppGateway acting as a proxy.
I see the log [http-nio-8080-exec-4] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
I've just increased the request-time-out on the Proxy to see if it changes something.
I've also seen cases where, on the client side, the loading indicator keeps resetting as if the page were being reloaded, but on the server side I do not see any incoming request. It's as if the client side were looping on itself.
In my case, the page only stops the refresh loop if I force a refresh.
Yes, this works most of the time. But asking a customer to do that is not an option. I also encountered cases where I had to clear all the stored data (session etc.) and restart navigation before I could reach the site again :(
Agreed. I never experienced this issue with Vaadin 8; with 23 it happens very often and my customers are becoming angry. I hope a fix will be delivered quickly.