Away and offline presence information do not update
Description:
This is a stub issue. I'll update it when I can reproduce it more reliably.
The RC web client stops updating timeout-based and connection-based presence information (away and offline).
Refreshing the web client seems to fix the issue.
Steps to reproduce:
- Have two users A and B in separate browsers
- Set user B's timeout for away status to the minimum (60 seconds) in B's profile
- Don't focus user B's browser anymore
- Follow presence status of user B as user A in the other browser
Expected behavior:
User B's green online presence indicator changes to orange away status after 60 seconds.
Actual behavior:
User B's status remains online forever.
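To make the reproduction less manual, user B's presence could be polled from a script and every status change logged with a timestamp. The following is only a minimal sketch. It assumes the REST endpoint `/api/v1/users.getPresence` with `X-Auth-Token` and `X-User-Id` headers and a `presence` field in the response; the function names and parameters are hypothetical. Because it asks the server directly, it may also help to tell whether the stale status exists only in A's client or on the server as well.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_presence(base_url, user_id, auth_token, target_user_id):
    """Ask the server for the target user's presence.

    Assumption: the endpoint path, the auth headers and the "presence"
    field in the response match the deployed Rocket.Chat version.
    """
    query = urllib.parse.urlencode({"userId": target_user_id})
    request = urllib.request.Request(
        f"{base_url}/api/v1/users.getPresence?{query}",
        headers={"X-Auth-Token": auth_token, "X-User-Id": user_id},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("presence", "unknown")


def watch_presence(get_status, polls, interval_s,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() and return a list of (elapsed_seconds, status)
    entries, one for each observed status change."""
    transitions = []
    last = None
    start = clock()
    for i in range(polls):
        status = get_status()
        if status != last:
            transitions.append((clock() - start, status))
            last = status
        if i < polls - 1:
            sleep(interval_s)
    return transitions
```

With the 60 second away timeout from the steps above, polling every 10 to 15 seconds for a few minutes as user A should show an `online` entry followed by an `away` entry if presence works, and only the `online` entry while the issue is active.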
Server Setup Information:
- Version of Rocket.Chat Server: RC 7.11.0
- License Type: Enterprise
- Number of Users: 500+
- Operating System: RHEL8
- Deployment Method: docker
- Number of Running Instances: 4
- DB Replicaset Oplog: Yes
- NodeJS Version: v22.16.0
- MongoDB Version: 7.0.25
Client Setup Information
- Desktop App or Browser Version: Latest Chrome and Firefox
Additional context
Refreshing A's browser window fixes B's false presence status for A.
The issue seems to relate only to automated status updates. If user B manually sets the status to away, busy, or offline, it also updates normally in A's browser, even while the issue otherwise remains active.
Hi, I’d like to work on this issue. Could you please assign it to me?
Have you found out what causes it?
Hi, I’d like to work on this issue. Could you please assign it to me?
In a word, no.
Read all this.
Thanks.
User B's status remains online forever.
Think this has been an on/off issue for ages?
When I'm home next week I'll look a bit more.
I'm also testing with a NATS server in our RC deployment. This may or may not have an effect on this issue. I'll report back when there is more experience with it.
The trigger for this testing was that NATS is recommended by the devs for multi-instance setups: https://forums.rocket.chat/t/troubleshooting-event-propagation-reactivity-issues-in-rocket-chat-7-7-0/22524
So far I have spotted no recurrence of this issue with the NATS config. For those who run without NATS, correctly setting the TCP_PORT env for internal mesh communication would likely help.
TCP_PORT for reference
https://github.com/RocketChat/Rocket.Chat/issues/36268#issuecomment-3461850753
This issue recurred for me. The issue is on the server side, because the database showed my user as online when I was not. (All clients had been closed since boot, including the mobile client.) I also used two separate accounts and computers at different public IP addresses to confirm that my account remained in the online state when it should not have.
Checking and cleaning up userSessions and rocketchat_sessions collections did not reveal additional sessions active for my account.
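The check described above could be expressed as a small helper that compares exported user documents with exported session documents and lists users who are marked as not offline but have no registered connection. This is only a sketch: the field names `_id`, `status` and `connections` are assumptions about the document layout, the function name is hypothetical, and the documents are expected to have been fetched separately.

```python
def find_stuck_online(users, sessions):
    """Return the ids of users whose status is not "offline" although no
    session document lists an open connection for them.

    Assumption: each user document has "_id" and "status", and each
    session document has "_id" (the user id) and a "connections" list.
    """
    users_with_connections = {
        session["_id"] for session in sessions if session.get("connections")
    }
    return [
        user["_id"]
        for user in users
        if user.get("status", "offline") != "offline"
        and user["_id"] not in users_with_connections
    ]
```

An account that appears in the result matches the symptom in this issue: the stored status says online, but there is no session left that could ever trigger the change to offline.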
Getting online and switching my status to away or offline and back again online manually worked without issues. But again after closing down my single browser client, the status stuck at online.
Server side operations that I did afterwards:
- Restarting reverse proxy
- Restarting instances
- Restarting NATS service
- Restarting mongodb
After these the status started updating properly again (after I had opened the web client once to recreate a new session).
In retrospect, for the next occurrence of the issue it is best to do the restarts one by one to figure out the real cause. I very much suspect a stuck connection between some of these components, and such a connection could be registered in some database collection I haven't yet figured out.
If the devs can point out any such collection apart from userSessions and rocketchat_sessions that maps the offline state and connections, the information would be welcome.
I have not witnessed this anymore with RC 7.12.2 or RC 7.13.0 and after increasing the OS nofile limits and the reverse proxy worker_connections limits from the default 1024 to higher values. When these were clogged, it messed up API calls badly.
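For anyone who wants to check whether a process is close to its open-file limit, the following is a rough sketch using Python's Unix-only `resource` module. It only inspects the process that runs it, the 80 % threshold is an arbitrary assumption, and the helper names are hypothetical.

```python
import os
import resource  # Unix-only standard library module


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count open file descriptors of the current process.

    Assumption: Linux, where /proc/self/fd lists them.
    """
    return len(os.listdir("/proc/self/fd"))


def is_near_limit(open_count, soft_limit, ratio=0.8):
    """True when open_count has reached ratio * soft_limit.

    An unlimited or non-positive soft limit is never reported as near.
    """
    if soft_limit == resource.RLIM_INFINITY or soft_limit <= 0:
        return False
    return open_count >= ratio * soft_limit
```

To look at the Rocket.Chat or reverse proxy processes themselves, the same soft and hard values can be read on Linux from `/proc/<pid>/limits`.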
I'm closing this one for now.
Opening this again. I witnessed this in the evening after the update to RC 7.13.1 that same morning.
This may be related to either:
- In some circumstances, the user using manual offline status and then setting it online again.
- The reload step that follows the update process, accompanied by refreshing (F5 or shift-reload) before or after it, or deep cache. I often do manual reloads after updates to ensure everything is in order.
When the issue is on, no amount of F5 or shift-reload helped. I suspect logout-login or deleting all cache from browser would have helped, though.
Usually the issue has cleared itself out the next day.
We are running NATS, and I also made sure there was only one instance running to rule out a streaming issue between instances.