Discussion: What should be the correct behavior on hibernating browser tabs?
Context
Browsers can "hibernate" tabs when they are not active. This ensures these tabs do not use any memory or CPU; the JavaScript process of a hibernated tab is stopped.
RxDB uses leader election for replication so that when many tabs are open, only one tab runs the replication. This saves a lot of CPU power and bandwidth.
People often reported that they had problems when the elected leader-tab became hibernated by the browser. This seemingly stopped the replication and caused problems.
To prevent this, RxDB added a hack to prevent tab hibernation if at least one replication is running: https://github.com/pubkey/rxdb/commit/c2c7ea487d1a7689e0bcfced8ae34dbeeb0459e6 But this hack is not good as it stops the device from power saving when the tab is not in use.
Goal
I will remove this hack in the next release. From my testing, the default behavior of RxDB with hibernating tabs is correct: the hibernated tab dies and a new leader is elected. You can test this behavior with the broadcast channel test-page, where the leading tab has a crown icon in the title. You can manually send tabs to hibernation in Chrome at chrome://discards/.
Question
Do you have any way to reproduce a case where the default behavior causes problems? What do you think the default behavior of RxDB should be?
Related discussions:
- https://discord.com/channels/969553741705539624/1268285064031113236/1334445658014486559
- https://discord.com/channels/969553741705539624/1319287058220712037/1319287058220712037
- https://discord.com/channels/969553741705539624/994606541116297276/1263886152658849853
This issue especially affects mobile devices.
I think we need a mechanism to switch the leader to the most recently active tab, since that tab will have the longest lifetime until it is suspended. When the user gets back to the website, the newly opened tab becomes leader again, which would make the problem unnoticeable for the end user and fix the issue.
@andreuka But on mobile, if a tab is hibernated, doesn't it also elect a new tab as leader? Because in my testing it does.
I mostly test on iOS, and from my experience it does not hibernate the tab immediately; it first puts it into a kind of slow mode. I have had a lot of complaints from users that the application does not work well on mobile devices.
I am not using replication, but I noticed this with my custom WebSocket connection, which is open in every tab and pushes the same events to all tabs. Only the leader tab wrote data to the database, and when tabs were switched, the leader did not change. So I made a hack: on mobile, every tab with an active WebSocket connection writes the data to the database. That works well enough, since only one tab will run fast.
This is my function to determine whether a tab should write data to the database:
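// Assumption: createLeaderElection comes from the broadcast-channel package;
// `channel` and `isIOS()` are defined elsewhere in the app.
import { createLeaderElection } from 'broadcast-channel';

const elector = createLeaderElection(channel);

export async function isLeader() {
    // Hack: on iOS every tab is allowed to write.
    if (isIOS()) {
        return true;
    }
    if (!elector) return false;
    // Write if no leader exists yet, or if this tab is the leader.
    const hasLeader = await elector.hasLeader();
    return !hasLeader || elector.isLeader;
}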
But that's not a perfect solution, as you can see, and I hope it can be done another way.
The "hack" is already there. I can think of two options.
- Keep the hack, make it optional as a separate plugin, and maintain it
import { RxDBLeaderKeepAlivePlugin } from 'rxdb/plugins/leader-election';
addRxPlugin(RxDBLeaderKeepAlivePlugin);
...or RxDBProtectTheLeaderPlugin
- Remove the hack and add a section to the guide/manual that describes the browser behavior and its potential risks, with instructions on how to implement the hack
Hello everyone. I just released version 16.5.0. This version contains the toggleOnDocumentVisible flag on the replication states. When toggleOnDocumentVisible is set, the replication will always run in the leader tab AND in the currently visible tab.
I also added some tests to ensure that the replication works without problems when running from two tabs at the same time. Please test this. When it works for everyone, I will remove the previous click-event hack and also make toggleOnDocumentVisible=true the default.
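The rule described above can be sketched as a small predicate. This is only an illustration of the described behavior, not RxDB's actual implementation; the function name `shouldReplicationRun` and its parameter object are hypothetical.

```javascript
// Sketch of the rule "run in the leader tab AND in the currently visible tab".
// NOT RxDB's implementation: the function name and parameters are hypothetical.
function shouldReplicationRun({ isLeader, isVisible, toggleOnDocumentVisible }) {
    // The elected leader always runs the replication.
    if (isLeader) {
        return true;
    }
    // With the flag set, a visible non-leader tab runs it as well.
    return Boolean(toggleOnDocumentVisible && isVisible);
}

// In a browser, `isVisible` could be derived from the Page Visibility API:
//   const isVisible = document.visibilityState === 'visible';
```

With the flag set, two tabs can replicate at the same time (a hidden leader and a visible non-leader), which is why the replication has to work when running from two tabs at once.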
Closing this. On the next major version, toggleOnDocumentVisible will be true by default. Discussion still welcomed if you have any ideas on how to improve.
@pubkey Hello, thank you for adding toggleOnDocumentVisible. It fixed the Socket closed error for us, but unfortunately we faced some new issues.
Here is a list of devices we tested:
| Device | RAM Type | Storage Type |
|---|---|---|
| Xiaomi 14 | LPDDR5X | UFS 4.0 |
| OnePlus 8 | LPDDR4X | UFS 3.0 |
| Xiaomi Redmi Note 10 Pro | LPDDR4X | UFS 2.2 |
| Google Pixel 4a | LPDDR4X | UFS 2.1 |
| Xiaomi A1 (Mi A1) | LPDDR3 | eMMC 5.1 |
| Tecno KI5k SPARK 10C | LPDDR4X | eMMC 5.1 |
| Redmi 13 (Redmi 14C) | LPDDR4X | eMMC 5.1 |
| Tecno Spark 10 | LPDDR4X | eMMC 5.1 |
| Samsung Galaxy A11 | LPDDR3 | eMMC 5.1 |
For the last few weeks, we’ve had a hard time with our bulkInsert and incrementalModify operations. The delay for these operations has skyrocketed from 200ms to 6000ms.
The first thing we discovered is that all devices using eMMC storage types are facing this issue. eMMC is known for having low random write speeds, which are crucial for SQLite. All phones that used UFS were not affected. After a long debugging session, we figured out that toggleOnDocumentVisible has been causing all of this. We haven’t checked the source code of RxDB yet to understand how it can affect performance so badly, but we’re certain that it is responsible for the additional load on devices, which is very noticeable on low-budget phones.
@space7panda I think it is very unlikely that this comes from toggleOnDocumentVisible, and based on the given information I do not think I can reproduce the problem.
Maybe you have a custom conflict handler which causes infinite write loops?
Thanks, we can double-check that, but I'm not sure how a conflict handler is related to our testing, where we had 2 separate APK files with:
- toggleOnDocumentVisible: false
- toggleOnDocumentVisible: true
As additional testing, we also downgraded RxDB to 15.39.0, and everything worked fine there because toggleOnDocumentVisible had not been implemented yet.
You could use the logger plugin to check what is going on in your storage. Likely some other writes block your bulk inserts, so they are slow while waiting to open a transaction.
Sure, we will check that and report back with the results.
@pubkey I can confirm that toggleOnDocumentVisible causes problems with bulk writes. I recently discovered this because RxDB didn't resolve the db.addCollections promise at all on Safari iOS. It is hanging precisely here.
At first I thought this issue was related to something else, but after a very long debugging session I concluded it must be the toggleOnDocumentVisible option: I rolled back to v15 (bug absent), went back to 16.0.0 (bug absent), and went all the way to 16.11.0 without toggleOnDocumentVisible (bug absent). After enabling toggleOnDocumentVisible, the issue happens consistently on BrowserStack Safari iOS.
In our case the app is stuck in a never-ending loading cycle and stays stuck until Safari is closed (or all tabs of the app are closed).
My suspicion is that this is related to the replicationState and that start and pause are called and not awaited, so the execution order is:
// important is that none of the functions are actually awaited even though they are async
replicationState.start();
replicationState.pause();
replicationState.start();
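To illustrate the suspected ordering problem, here is a minimal, self-contained simulation. `FakeReplicationState` is a hypothetical stand-in and does not reflect RxDB internals; it only assumes that start() performs asynchronous work before its internal state exists.

```javascript
// Hypothetical stand-in for a replication state; only the async timing matters.
class FakeReplicationState {
    constructor() {
        this.log = [];
        this.internal = undefined;
    }
    async start() {
        this.log.push('start:begin');
        await Promise.resolve(); // stands in for asynchronous startup work
        this.internal = { running: true };
        this.log.push('start:end');
    }
    async pause() {
        // Record whether the startup had finished when pause() was entered.
        this.log.push(this.internal ? 'pause:ok' : 'pause:before-start-finished');
        if (this.internal) {
            this.internal.running = false;
        }
    }
}

async function runUnawaited() {
    const state = new FakeReplicationState();
    // None of these calls is awaited before the next one is made.
    const pending = [state.start(), state.pause(), state.start()];
    await Promise.all(pending); // only so the log is complete when it is read
    return state.log;
}
```

In this simulation pause() is entered while the first start() is still in flight, so it sees no internal state yet.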
@pubkey as a followup I've debugged the whole thing with the logger plugin and those are the operations which aren't finishing:
bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq
bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd
My bug is also NOT happening with the localstorage, dexie and memory storages. We are using the indexeddb storage at the moment.
The dexie storage gives some errors which the indexeddb doesn't give, might be related:
RxDB.dexie.dbname.rx-replication-meta-0598bd57c35076cd146223db42020d0e71cedf64117e5ca054c60674c7af053d.findDocumentsById(1) instance:yyvdvdohtj_opId:yzmsworl: ERROR: DatabaseClosedError
RxDB.dexie.dbname.rx-replication-meta-e547ce34efcc417fc7e2b633d5b2c6432a24131a21395b0c961e73b3833688ac.findDocumentsById(1) instance:yyvdvdohtj_opId:ejxlvkom: ERROR: DatabaseClosedError
RxDB.dexie.dbname.rx-replication-meta-49a45dd831c7fdd1c2960ba31ce678837216f5da7d7be055a2efb24840097009.findDocumentsById(1) instance:yyvdvdohtj_opId:txrjajph: ERROR: DatabaseClosedError
I'm happy to give further data / assist you in fixing this issue.
I do not see any reason why a call to storageInstance.bulkWrite() can hang up, it should either resolve at some point or throw an error.
Since this is reproducible with the dexie storage, it should be possible to make a PR with a test case?
I couldn't reproduce the hanging behavior in Safari with the localstorage, dexie and memory storages. The error I posted above was from the dexie storage, but the hanging didn't occur there. The indexeddb storage didn't give any errors, but it hung the browser.
The PR #7095 fixed the hanging issue for us, but we still think that if one calls replicationState.start() at an inconvenient point in time, the issue would return.
@pubkey We implemented and deployed the fix from PR #7095 and we are getting Sentry error reports from the replicationState.pause function:
ensureNotFalsy() is falsy:
So before releasing a new version of rxdb it might be beneficial to wrap the whole thing in a try catch block?
No, a try-catch is not the correct solution. We should detect where this comes from. ensureNotFalsy should never throw at runtime. It is only used to satisfy TypeScript.
@pubkey It's thrown here: https://github.com/pubkey/rxdb/blob/df21fdcdbfe434dc505d93982b756b0f2c87a251/src/plugins/replication/index.ts#L421 (this.internalReplicationState is undefined)
So maybe there should be additional checks before replicationState.pause is called, or the pause function should just do nothing in that case.
If .pause is called while .start is still starting up, the .pause call should await the startup-procedure. This likely would also fix your DatabaseClosedError from before.
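One way to make pause() wait for a start() that is still in progress is to chain every call onto a shared promise. The following is a minimal sketch under that assumption, not the RxDB implementation; the class `SerializedReplicationState` and its `enqueue` helper are hypothetical names.

```javascript
// Sketch: serialize start() and pause() so that each call waits for the
// previous one. Hypothetical class, not part of RxDB.
class SerializedReplicationState {
    constructor() {
        this.queue = Promise.resolve();
        this.log = [];
        this.internal = undefined;
    }
    // Chain the operation onto the previous one so calls run in call order.
    enqueue(operation) {
        const result = this.queue.then(operation);
        // Keep the chain usable even if one operation rejects.
        this.queue = result.catch(() => {});
        return result;
    }
    start() {
        return this.enqueue(async () => {
            await Promise.resolve(); // stands in for asynchronous startup work
            this.internal = { running: true };
            this.log.push('started');
        });
    }
    pause() {
        return this.enqueue(async () => {
            if (!this.internal) {
                return; // never started, nothing to pause
            }
            this.internal.running = false;
            this.log.push('paused');
        });
    }
}
```

With this approach, callers may fire `start(); pause(); start();` without awaiting, and the operations still run one after another in call order.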
@pubkey Well, before the issue was fixed in PR #7095, replicationState.pause() was never called in the visibilitychange event, so the DatabaseClosedError (at least at that time) didn't come from the replicationState.pause() call in that event.
I guess it's a good idea to implement that, although I don't have the capacity to make a PR for now.
@KingSora @pubkey Our issue is also related to start() and pause(), but in a slightly different way.
We are using Capacitor with RxDB, and start() and pause() are triggered for us in 2 scenarios:
- when we launch the camera
- when the app gets minimized and maximized
Basically, users can hide and open the app 10 times in a row, which causes a 10x combo of start() and pause(), and that's why we had that big lag on budget phones.
Then we tried to simulate that with the following pseudo code:
App.addListener('appStateChange', ({ isActive }) => {
    // ...
    replicationState.start();
    // ...
    replicationState.pause();
    // ...
});
And we got the same results with a bit less lag.
@space7panda I could not reproduce your problem. Can you make a PR with a test case that simulates this behavior and shows that it causes errors?