at_client_sdk
at_client_sdk copied to clipboard
Notifications get hung up as secondary syncs
Describe the bug When sending frequent updates (say 2hertz) there is noticeable delay in notifications received in the client.
To Reproduce Steps to reproduce the behavior:
- First I send a lot of notifications to a secondary (I used the GPS Demo code)
- Then I connect a client and subscribe to nofications
- As the client is synching no notifications are being dealt with and so they get hung up until sync completes
Expected behavior Notifications should not be held up by the sync process
Screenshots
https://user-images.githubusercontent.com/6131216/191070158-9d74f280-b9a7-418d-b13d-f1876a17af92.mp4
Were you using an atApplication when the bug was found?
- GPS Demo is what I used
Additional context Add any other context about the problem here.
Much improved by removing the old hive boxes on the receiver and changing the onboarding Pref to
//onboarding preference builder can be used to set onboardingService parameters
AtOnboardingPreference atOnboardingConfig = AtOnboardingPreference()
..hiveStoragePath = '$homeDirectory/.$nameSpace/$receivingAtsign/$deviceName/storage'
//..syncBatchSize = 1
..syncIntervalMins = 10
..syncRegex = "ai6bh"
..namespace = nameSpace
..downloadPath = '$homeDirectory/.$nameSpace/files'
..isLocalStoreRequired = true
..commitLogPath = '$homeDirectory/.$nameSpace/$receivingAtsign/$deviceName/storage/commitLog'
..fetchOfflineNotifications = false
..monitorHeartbeatInterval = Duration(seconds: 60)
..atKeysFilePath = atsignFile;
```
After leaving over night the symptoms slowly come back, what I also observed is that the monitor can also die in this hang which leads to more delays.
WARNING|2022-09-20 13:55:38.408682|Monitor|Heartbeat response not received within expected duration. Heartbeat was sent at 1663682137536, it is now 1663682138408, last heartbeat response was received at 1663682114992. Will close connection, set status stopped, call retryCallback
INFO|2022-09-20 13:55:38.409651|Monitor|socket.listen onDone called. Will destroy socket, set status stopped, call retryCallback
INFO|2022-09-20 13:55:38.410344|Monitor|Another closeStopAndRetry operation is in progress
INFO|2022-09-20 13:55:38.468482|SyncService|Returning the serverCommitId 14291548**
I failed to put a SyncProgressListener in place.. Not sure of the correct way to do it on the GPS code.. Let me know when you can and I will add and run a new test over night.. @VJag @kalluriramkumar
Hi @cconstab. Can you please check "at_functional_test/test/atclient_sync_callback_test.dart" in SDK.
Yup that's where I was looking earlier but I could not see understand how to get a progress Listener and then to use it..
Sorry for being a noob !
I failed to put a SyncProgressListener in place.. Not sure of the correct way to do it on the GPS code.. Let me know when you can and I will add and run a new test over night.. @VJag @kalluriramkumar
Added sync progress listener and pushed the changes to a new branch add-sync-progress-listener
.
Following are the changes added: https://github.com/atsign-foundation/mwc_demo/commit/2f1756835206cf28a9937463279ac974b98d2d64
@VJag @kalluriramkumar.. This might be significant in that when starting up a flutter app the sync runs (in my case as you onboard the atSign) and during that time the whole app "locks" up as soon as the second message apears things move along..
flutter: INFO|2022-09-21 21:50:20.154337|AtLookup|auth success
BECOMES "LOCKED"
flutter: INFO|2022-09-21 21:50:20.213338|SyncService|Returning the serverCommitId 16919324
flutter: INFO|2022-09-21 21:50:27.390809|SyncService|Returning the serverCommitId 16919324
flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Inside syncComplete. syncRequest.requestSource : SyncRequestSource.app ; syncRequest.onDone : Closure: () => Null
flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Sending result to onDone callback
BACK AND RESPONSIVE
This is repeatable, but only with atsigns that have had a lot of commitIds (16919324 in this case) or just have had a lot of notifications from what I am seeing. Not sure if it is one or the other causing the issue ?
Q. WHy do we see the same log line twice ??
@VJag @kalluriramkumar.. This might be significant in that when starting up a flutter app the sync runs (in my case as you onboard the atSign) and during that time the whole app "locks" up as soon as the second message apears things move along..
flutter: INFO|2022-09-21 21:50:20.154337|AtLookup|auth success BECOMES "LOCKED" flutter: INFO|2022-09-21 21:50:20.213338|SyncService|Returning the serverCommitId 16919324 flutter: INFO|2022-09-21 21:50:27.390809|SyncService|Returning the serverCommitId 16919324 flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Inside syncComplete. syncRequest.requestSource : SyncRequestSource.app ; syncRequest.onDone : Closure: () => Null flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Sending result to onDone callback BACK AND RESPONSIVE
This is repeatable, but only with atsigns that have had a lot of commitIds (16919324 in this case) or just have had a lot of notifications from what I am seeing. Not sure if it is one or the other causing the issue ?
Q. WHy do we see the same log line twice ??
@cconstab : Found the potential cause for sync to trigger. Fixed the issue and changes are pushed to the following branch : ignore_latestNotificationKey_from_sync
Can you please point to the above at_client SDK and test?
Testing Notes: Tested on @87bluebaton
and the when receiving the notifications, the sync is not triggered. Received 208 notifications and server commit ID is at 6.
@87bluebaton@stats:3
data: [{"id":"3","name":"lastCommitID","value":"6"}]
@87bluebaton@stats:11
data: [{"id":"11","name":"NotificationCount","value":"{\"total\":208,\"type\":{\"sent\":0,\"received\":208},\"status\":{\"delivered\":0,\"failed\":0,\"queued\":0},\"operations\":{\"update\":208,\"delete\":0},\"messageType\":{\"key\":2,\"text\":206},\"createdOn\":1663849517286}"}]
For sure thanks
I think that's solves one thing but does not solve the lock up when you hit a high number of commits..
This fix prevents unnecessary sync's hence there is no commit log and the related logic to compare commit id's. However, murali is working on implementing commit log compaction. The ticket for it is https://github.com/atsign-foundation/at_client_sdk/issues/553.
@VJag @kalluriramkumar.. This might be significant in that when starting up a flutter app the sync runs (in my case as you onboard the atSign) and during that time the whole app "locks" up as soon as the second message apears things move along..
flutter: INFO|2022-09-21 21:50:20.154337|AtLookup|auth success BECOMES "LOCKED" flutter: INFO|2022-09-21 21:50:20.213338|SyncService|Returning the serverCommitId 16919324 flutter: INFO|2022-09-21 21:50:27.390809|SyncService|Returning the serverCommitId 16919324 flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Inside syncComplete. syncRequest.requestSource : SyncRequestSource.app ; syncRequest.onDone : Closure: () => Null flutter: INFO|2022-09-21 21:50:34.531585|SyncService|Sending result to onDone callback BACK AND RESPONSIVE
This is repeatable, but only with atsigns that have had a lot of commitIds (16919324 in this case) or just have had a lot of notifications from what I am seeing. Not sure if it is one or the other causing the issue ? Q. WHy do we see the same log line twice ??
@cconstab : Found the potential cause for sync to trigger. Fixed the issue and changes are pushed to the following branch : ignore_latestNotificationKey_from_sync
Can you please point to the above at_client SDK and test?
Testing Notes: Tested on
@87bluebaton
and the when receiving the notifications, the sync is not triggered. Received 208 notifications and server commit ID is at 6.@87bluebaton@stats:3 data: [{"id":"3","name":"lastCommitID","value":"6"}] @87bluebaton@stats:11 data: [{"id":"11","name":"NotificationCount","value":"{\"total\":208,\"type\":{\"sent\":0,\"received\":208},\"status\":{\"delivered\":0,\"failed\":0,\"queued\":0},\"operations\":{\"update\":208,\"delete\":0},\"messageType\":{\"key\":2,\"text\":206},\"createdOn\":1663849517286}"}]
Spent 5 SP to set the at_gps demo in local, find the root cause of the issue and provide a fix.
The following https://github.com/atsign-foundation/at_client_sdk/issues/750 issue is the potential cause for the notifications getting hung when the sync runs and PR: https://github.com/atsign-foundation/at_client_sdk/pull/760 has the fix
This is working as it should now thanks to @sitaram-kalluri