SSS got completely stuck
Steps to reproduce
- opened the app on slightly dodgy connectivity (1 bar of Wi-Fi)
- room list showed stale rooms from hours ago
- timelines within rooms showed stale history too
- waited a while to see if a spinner would turn up, or history would resync, even after moving onto good connectivity
- no spinner; no sync
Outcome
What did you expect?
There should be a spinner when you are staring at stale history, so you aren't left wondering whether it's stale or not.
Sync should not get stuck due to bad connectivity, but retry when connectivity recovers.
What happened instead?
Stuck sync, with zero UI feedback to tell you you're offline or looking at stale info.
Your phone model
No response
Operating system version
No response
Application version
669
Homeserver
No response
Will you send logs?
Yes
Ah, this happened because I restarted the server, which blew away the in-memory cache of which rooms we'd sent down. This caused it to basically try to send down all your rooms again.
We need to migrate the per-connection state to the DB, but for now: https://github.com/element-hq/synapse/pull/17529
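To make the failure mode concrete, here's a minimal sketch (invented names throughout, not the actual Synapse implementation) of per-connection state held purely in process memory, and why a restart forces everything to be resent:

```python
# Illustrative sketch only -- not Synapse's real data structures.
# It shows why sliding-sync per-connection state kept purely in
# process memory is lost on restart, making every room look unsent.

class InMemoryPerConnectionState:
    """Tracks which rooms have already been sent down each connection."""

    def __init__(self) -> None:
        # conn_id -> set of room IDs already sent down that connection
        self._sent_rooms: dict[str, set[str]] = {}

    def rooms_to_send(self, conn_id: str, requested: set[str]) -> set[str]:
        # After a restart self._sent_rooms is empty again, so this
        # returns *every* requested room, each needing full state from
        # scratch -- the slow "send down all your rooms" path.
        already_sent = self._sent_rooms.setdefault(conn_id, set())
        return requested - already_sent

    def mark_sent(self, conn_id: str, rooms: set[str]) -> None:
        self._sent_rooms.setdefault(conn_id, set()).update(rooms)
```

Persisting the same conn_id -> rooms mapping in a DB table is the "migrate the per-connection state to the DB" fix: it would survive restarts, with the linked PR as the stopgap in the meantime.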
PR has landed and been deployed
I feel like this should be re-opened to address better UI feedback:
> zero UI feedback to tell you you're offline or looking at stale info.
I also want to point out that if @erikjohnston's investigation is correct, /sync wasn't completely stuck, just slow: the client asks for a full range of rooms, and without the cache telling us whether a room has already been sent down the connection, we end up sending down all rooms and their state from scratch (which can be very slow). With https://github.com/element-hq/synapse/pull/17529, we expire the connection and give the client a chance to make an initial request with a smaller range of rooms, so it gets some results sooner, but it will still take the same amount of time (more, counting the extra round-trips and re-processing) to get everything again.
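As a rough illustration of that expiry behaviour (a sketch under assumed names -- UnknownPosError, do_request, and the range values are all invented, not the PR's actual code), the client-side effect looks something like this:

```python
# Client-side sketch with invented names; the real protocol details
# live in the sliding sync spec and in the linked Synapse PR.

class UnknownPosError(Exception):
    """The server expired (or forgot) this connection's position."""

def sync_loop(do_request) -> None:
    """Drive sync; do_request(pos, room_range) returns the next pos."""
    pos = None
    # A small initial window gets *some* rooms on screen quickly ...
    room_range = (0, 19)
    while True:
        try:
            pos = do_request(pos, room_range)
            # ... then widen the window to backfill the rest.
            room_range = (0, 99)
        except UnknownPosError:
            # Connection was expired server-side: forget the position
            # and redo an initial request with the small range again.
            pos, room_range = None, (0, 19)
```

Restarting with the small range is what gets the user some results sooner; the total amount of data (plus the extra round-trips and re-processing) is unchanged.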
So it may be a different cause, but I just got this again.
@ara4n can you send a rageshake when it happens again?
@ara4n is it still an issue? If yes, send a rageshake when you encounter the bug.
Given the super positive feedback we got in an internal room on Oct 23rd, I am closing it.