rxdb Discussion: What should be the correct behavior on hibernating browser tabs?

Context

Browsers can "hibernate" tabs when they are not active. This ensures these tabs do not use any memory or CPU; the JavaScript process of a hibernated tab is stopped.

RxDB uses a leader-election for replication so that when many tabs are open, only one tab runs the replication. This saves a lot of CPU power and bandwidth.

People often reported that they had problems when the elected leader-tab became hibernated by the browser. This seemingly stopped the replication and caused problems.

To prevent this, RxDB added a hack to prevent tab hibernation if at least one replication is running: https://github.com/pubkey/rxdb/commit/c2c7ea487d1a7689e0bcfced8ae34dbeeb0459e6 But this hack is not good as it stops the device from power saving when the tab is not in use.

Goal

I will remove this hack in the next release. From my testing, the default behavior of RxDB with hibernating tabs is correct: the hibernated tab dies and a new leader is elected. You can test this behavior with the broadcast channel test-page, where the leading tab has a crown icon in the title. You can manually send tabs to hibernation in Chrome at chrome://discards/.
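The expected behavior (a hibernated leader stops, another tab takes over) can be pictured with a small lease-based toy model. This is only an illustrative sketch: the `LeaseElection` class, the tab names, and the 1000 ms lease are hypothetical, and it is not the algorithm RxDB or the broadcast channel library actually uses.

```typescript
// Toy model: every running tab sends heartbeats. The leader keeps its lease
// only while it keeps sending them. A hibernated tab stops, its lease expires,
// and the next running tab that sends a heartbeat takes over.
type TabId = string;

class LeaseElection {
    private leader: TabId | null = null;
    private leaseExpiresAt = 0;
    private readonly leaseMs: number;

    constructor(leaseMs: number) {
        this.leaseMs = leaseMs;
    }

    // Called periodically by every tab whose JavaScript is still running.
    heartbeat(tab: TabId, now: number): void {
        const leaseExpired = now >= this.leaseExpiresAt;
        if (this.leader === null || leaseExpired || this.leader === tab) {
            this.leader = tab;
            this.leaseExpiresAt = now + this.leaseMs;
        }
    }

    // Returns the tab holding a valid lease, or null if the lease ran out.
    currentLeader(now: number): TabId | null {
        return now < this.leaseExpiresAt ? this.leader : null;
    }
}

const election = new LeaseElection(1000);
election.heartbeat("tab-a", 0);   // tab-a becomes leader
election.heartbeat("tab-b", 500); // tab-b is running, but tab-a still holds the lease
const leaderBefore = election.currentLeader(600);

// tab-a gets hibernated here and sends no more heartbeats.
election.heartbeat("tab-b", 1500); // the lease expired at 1000, so tab-b takes over
const leaderAfter = election.currentLeader(1600);
```

In this model `leaderBefore` is the original leader and `leaderAfter` is the tab that kept running.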

Question

Do you have any way to reproduce a case where the default behavior causes problems? What do you think the default behavior of RxDB should be?

Related discussions:

https://discord.com/channels/969553741705539624/1268285064031113236/1334445658014486559
https://discord.com/channels/969553741705539624/1319287058220712037/1319287058220712037
https://discord.com/channels/969553741705539624/994606541116297276/1263886152658849853

Feb 01 '25 10:02 pubkey

This issue especially affects mobile devices.

I think we need a mechanism that switches the leader to the most recently active tab. That tab will have the longest remaining lifetime before it is suspended, and when the user comes back to the website, the newly opened tab becomes the leader again. The switch would not be noticeable for the end user and would fix the issue.
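A minimal sketch of that selection rule, assuming each tab reports a timestamp of when it was last visible. The `TabInfo` shape and the `pickMostRecentlyActive` helper are hypothetical names for illustration, not part of RxDB.

```typescript
interface TabInfo {
    id: string;
    lastVisibleAt: number; // e.g. Date.now() captured when the tab became visible
}

// Returns the tab that was visible most recently, or undefined if there are no tabs.
function pickMostRecentlyActive(tabs: TabInfo[]): TabInfo | undefined {
    let best: TabInfo | undefined = undefined;
    for (const tab of tabs) {
        if (best === undefined || tab.lastVisibleAt > best.lastVisibleAt) {
            best = tab;
        }
    }
    return best;
}

const openTabs: TabInfo[] = [
    { id: "tab-a", lastVisibleAt: 1000 },
    { id: "tab-b", lastVisibleAt: 4000 },
    { id: "tab-c", lastVisibleAt: 2500 },
];
const nextLeader = pickMostRecentlyActive(openTabs);
```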

Feb 01 '25 10:02 andreuka

@andreuka But on mobile, if a tab is hibernated, doesn't it also elect a new tab as leader? Because in my testing it does.

Feb 01 '25 10:02 pubkey

I mostly test on iOS, and from my experience it does not hibernate the tab immediately; it first puts it into a kind of slow mode. I had a lot of complaints from users that the application does not work well on mobile devices.

I am not using replication, but I noticed this with my custom WebSocket connection, which is open in both tabs and pushes the same events to all tabs. Only the leader tab writes data to the database, and when the tabs were switched, the leader did not change. So I made a hack: on mobile, every tab with an active WebSocket connection writes the data to the database. That works well enough, since only one tab will run fast.

This is my function to determine whether a tab should write data to the database:

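// Assumes the `broadcast-channel` package; `channel` and isIOS() are defined elsewhere in the app.
import { createLeaderElection } from 'broadcast-channel';

const elector = createLeaderElection(channel);

// Returns true if this tab should write incoming data to the database.
export async function isLeader() {
    // Hack: on iOS every tab writes, because the leader does not change when tabs are switched.
    if (isIOS()) {
        return true;
    }

    // If no leader has been elected yet, write from this tab.
    if (!(await elector.hasLeader())) return true;

    // Otherwise, only the elected leader writes.
    return elector.isLeader;
}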

But that's not a perfect solution, as you can see, and I hope it can be done another way.

Feb 01 '25 11:02 andreuka

The "hack" is already there. I can think of two options.

Keep the hack, make it optional as another plugin, and maintain it:

import { RxDBLeaderKeepAlivePlugin } from 'rxdb/plugins/leader-election';
addRxPlugin(RxDBLeaderKeepAlivePlugin);

...or RxDBProtectTheLeaderPlugin

Remove the hack, and add information about the browser behavior and the potential risks to the guide/manual, together with instructions on how to implement the hack.

Feb 03 '25 20:02 paul-geisler

Hello everyone. I just released version 16.5.0. This version contains the toggleOnDocumentVisible flag on the replication states. When toggleOnDocumentVisible is set, the replication will always run in the leader AND in the currently visible tab.
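The rule described above can be written down as a small predicate. This is a sketch of the described behavior only; `TabState` and `shouldReplicationRun` are hypothetical names and do not mirror RxDB's internal code.

```typescript
interface TabState {
    isLeader: boolean;
    documentVisible: boolean;
}

// With the flag off, only the leader replicates.
// With the flag on, the leader and any visible tab replicate.
function shouldReplicationRun(tab: TabState, toggleOnDocumentVisible: boolean): boolean {
    if (tab.isLeader) {
        return true;
    }
    return toggleOnDocumentVisible && tab.documentVisible;
}

const visibleFollower: TabState = { isLeader: false, documentVisible: true };
const hiddenLeader: TabState = { isLeader: true, documentVisible: false };
const hiddenFollower: TabState = { isLeader: false, documentVisible: false };
```

One consequence of this rule is that two tabs can replicate at the same time (a hidden leader plus a visible non-leader), which is why the replication has to work when running from two tabs at once.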

I also added some tests to ensure that the replication works without problems when running from two tabs at the same time. Please test this. When it works for everyone, I will remove the previous click-event hack and also make toggleOnDocumentVisible=true the default.

Feb 04 '25 23:02 pubkey

Closing this. On the next major version, toggleOnDocumentVisible will be true by default. Discussion still welcomed if you have any ideas on how to improve.

Apr 01 '25 12:04 pubkey

@pubkey Hello, thank you for adding toggleOnDocumentVisible. It fixed the Socket closed Error for us, but unfortunately we ran into some new issues.

Here is a list of devices we tested:

Device	RAM Type	Storage Type
Xiaomi 14	LPDDR5X	UFS 4.0
OnePlus 8	LPDDR4X	UFS 3.0
Xiaomi Redmi Note 10 Pro	LPDDR4X	UFS 2.2
Google Pixel 4a	LPDDR4X	UFS 2.1
Xiaomi A1 (Mi A1)	LPDDR3	eMMC 5.1
Tecno KI5k SPARK 10C	LPDDR4X	eMMC 5.1
Redmi 13 (Redmi 14C)	LPDDR4X	eMMC 5.1
Tecno Spark 10	LPDDR4X	eMMC 5.1
Samsung Galaxy A11	LPDDR3	eMMC 5.1

For the last few weeks, we’ve had a hard time with our bulkInsert and incrementalModify operations. The delay for these operations has skyrocketed from 200ms to 6000ms.

The first thing we discovered is that all devices using eMMC storage types are facing this issue. eMMC is known for having low random write speeds, which are crucial for SQLite. All phones that used UFS were not affected. After a long debugging session, we figured out that toggleOnDocumentVisible has been causing all of this. We haven’t checked the source code of RxDB yet to understand how it can affect performance so badly, but we’re certain that it is responsible for the additional load on devices, which is very noticeable on low-budget phones.

Apr 07 '25 10:04 space7panda

@space7panda I think it is very unlikely that this comes from toggleOnDocumentVisible, and based on the given information I do not think it is possible for me to reproduce that problem.

Maybe you have a custom conflict handler which causes infinite write loops?

Apr 07 '25 10:04 pubkey

Thanks, we can double-check that, but I'm not sure how the conflict handler is related to our test, where we had two separate APK files: one with toggleOnDocumentVisible: false and one with toggleOnDocumentVisible: true.

As an additional test, we downgraded RxDB to 15.39.0, and everything worked fine there because toggleOnDocumentVisible had not been implemented yet.

Apr 07 '25 11:04 space7panda

You could use the logger plugin to check what is going on in your storage. Likely some other writes block your bulk inserts, so they are slow because they wait to open a transaction.

Apr 07 '25 11:04 pubkey

Sure, we will check that and report the results.

Apr 07 '25 11:04 space7panda

@pubkey I can confirm that toggleOnDocumentVisible causes problems with bulk writes. I discovered this recently because RxDB didn't resolve the db.addCollections promise at all on Safari on iOS. It hangs precisely here.

At first I thought this issue was related to something else, but after a very long debugging session I concluded it must be the toggleOnDocumentVisible option: I rolled back to v15 (bug absent), went to 16.0.0 (bug absent), and went all the way to 16.11.0 without toggleOnDocumentVisible (bug absent). After enabling toggleOnDocumentVisible, the issue happens consistently on BrowserStack Safari iOS.

In our case the app is stuck in a never-ending loading cycle and will stay stuck until Safari is closed (or all tabs of the app are closed).

My suspicion is that this is related to the replicationState and that start and pause are called and not awaited, so the execution order is:

// important is that none of the functions are actually awaited even though they are async
replicationState.start();
replicationState.pause();
replicationState.start();

Apr 18 '25 17:04 KingSora

@pubkey As a follow-up, I've debugged the whole thing with the logger plugin, and these are the operations that aren't finishing:

bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(1) instance:yneqilqpvj_opId:nwevzxjq

bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd,bulkWriteRxDB.indexeddb.dbname._rxdb_internal.bulkWrite(6) instance:yneqilqpvj_opId:ntnxzpsd

My bug also does NOT happen with the localstorage, dexie, and memory storages. We are using the indexeddb storage at the moment.

The dexie storage gives some errors which the indexeddb doesn't give, might be related:

RxDB.dexie.dbname.rx-replication-meta-0598bd57c35076cd146223db42020d0e71cedf64117e5ca054c60674c7af053d.findDocumentsById(1) instance:yyvdvdohtj_opId:yzmsworl: ERROR: DatabaseClosedError

RxDB.dexie.dbname.rx-replication-meta-e547ce34efcc417fc7e2b633d5b2c6432a24131a21395b0c961e73b3833688ac.findDocumentsById(1) instance:yyvdvdohtj_opId:ejxlvkom: ERROR: DatabaseClosedError

RxDB.dexie.dbname.rx-replication-meta-49a45dd831c7fdd1c2960ba31ce678837216f5da7d7be055a2efb24840097009.findDocumentsById(1) instance:yyvdvdohtj_opId:txrjajph: ERROR: DatabaseClosedError

I'm happy to give further data / assist you in fixing this issue.

Apr 19 '25 10:04 KingSora

@pubkey I can confirm that toggleOnDocumentVisible causes problems with bulk writes. I discovered this recently because RxDB didn't resolve the db.addCollections promise at all on Safari on iOS. It hangs precisely here.

I do not see any reason why a call to storageInstance.bulkWrite() can hang up, it should either resolve at some point or throw an error.

Since this is reproducible with the dexie storage, it should be possible to make a PR with a test case?

Apr 23 '25 08:04 pubkey

I couldn't reproduce the hanging behavior in Safari with the localstorage, dexie, and memory storages. The error I posted above was from the dexie storage, but the hanging didn't occur there. The indexeddb storage didn't give any errors, but it hung the browser.

PR #7095 fixed the hanging issue for us, but we still think that if one calls replicationState.start() at an inconvenient point in time, the issue would return.

Apr 23 '25 09:04 KingSora

@pubkey We implemented and deployed the fix from PR #7095, and we are getting Sentry error reports from the replicationState.pause function:

ensureNotFalsy() is falsy:

So before releasing a new version of rxdb it might be beneficial to wrap the whole thing in a try catch block?

Apr 23 '25 11:04 KingSora

No, a try-catch is not the correct solution. We should detect where this comes from. ensureNotFalsy should never ever throw at runtime. It is only used to satisfy TypeScript.

Apr 23 '25 12:04 pubkey

@pubkey It's thrown here: https://github.com/pubkey/rxdb/blob/df21fdcdbfe434dc505d93982b756b0f2c87a251/src/plugins/replication/index.ts#L421 (this.internalReplicationState is undefined)

So maybe before replicationState.pause is being called there should be additional checks or the pause function just does nothing in that case

Apr 23 '25 12:04 KingSora

If .pause is called while .start is still starting up, the .pause call should await the startup-procedure. This likely would also fix your DatabaseClosedError from before.
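One way to get that behavior is to chain every `start()` and `pause()` call onto the previous one, so a `pause()` issued during startup simply waits its turn. The sketch below is a toy under that assumption; `SerializedReplicationState` is a hypothetical class and not how RxDB implements it.

```typescript
class SerializedReplicationState {
    readonly history: string[] = [];
    private queue: Promise<void> = Promise.resolve();
    private running = false;

    start(): Promise<void> {
        return this.enqueue(async () => {
            await Promise.resolve(); // stands in for asynchronous startup work
            this.running = true;
            this.history.push("started");
        });
    }

    pause(): Promise<void> {
        return this.enqueue(async () => {
            this.running = false;
            this.history.push("paused");
        });
    }

    isRunning(): boolean {
        return this.running;
    }

    // Runs the task only after all previously enqueued tasks have settled.
    private enqueue(task: () => Promise<void>): Promise<void> {
        const next = this.queue.then(task);
        // Keep the chain usable even if a task rejects.
        this.queue = next.catch(() => undefined);
        return next;
    }
}
```

With this shape, calling `start(); pause(); start();` without awaiting still executes the three steps in call order, and the state ends up running.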

Apr 23 '25 12:04 pubkey

@pubkey Well, previously replicationState.pause() was never called in the visibilitychange event because of the issue fixed in PR #7095, so the DatabaseClosedError (at least at that time) didn't come from the replicationState.pause() in that event.

I guess it's a good idea to implement that, although I don't have the capacity to make a PR for now.

Apr 23 '25 12:04 KingSora

@KingSora @pubkey Our issue is also related to start() and pause(), but in a slightly different way.

We are using Capacitor with RxDB, and start() and pause() are triggered for us in two scenarios:

when we launch the camera
when the app gets minimized and maximized

Basically, users can hide and open the app 10 times in a row, which causes 10 start()/pause() pairs back to back, and that's why we had that big lag on budget phones.

Apr 23 '25 13:04 space7panda

We then tried to simulate that with the following pseudo code:

App.addListener('appStateChange', ({ isActive }) => {
    ....
    replicationState.start();
    ....
    replicationState.pause();
    ....
});

And we got the same results, with a bit less lag.
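If rapid hide/show cycles are the trigger, one possible app-side mitigation is to coalesce the toggles so that only the latest wanted state is applied once the previous transition has finished. The sketch below is a hypothetical helper, not an RxDB or Capacitor API; `ToggleCoalescer` and `simulateRapidToggles` are made-up names, and the `apply` callback stands in for calling `replicationState.start()` or `replicationState.pause()`.

```typescript
type Desired = "running" | "paused";

class ToggleCoalescer {
    private desired: Desired;
    private actual: Desired;
    private loop: Promise<void> | undefined;
    private readonly apply: (target: Desired) => Promise<void>;

    constructor(initial: Desired, apply: (target: Desired) => Promise<void>) {
        this.desired = initial;
        this.actual = initial;
        this.apply = apply;
    }

    // Records the wanted state. Intermediate requests made while a transition
    // is still in flight are overwritten, so only the latest one is applied.
    request(target: Desired): Promise<void> {
        this.desired = target;
        if (this.loop === undefined) {
            const done = () => { this.loop = undefined; };
            this.loop = this.reconcile().then(done, (err) => { done(); throw err; });
        }
        return this.loop;
    }

    private async reconcile(): Promise<void> {
        while (this.actual !== this.desired) {
            const target = this.desired;
            await this.apply(target);
            this.actual = target;
        }
    }
}

// Simulates a user hiding and re-opening the app `times` times in a row.
async function simulateRapidToggles(times: number): Promise<Desired[]> {
    const applied: Desired[] = [];
    const coalescer = new ToggleCoalescer("running", async (target) => {
        await Promise.resolve(); // stands in for the real start()/pause() call
        applied.push(target);
    });
    let settled: Promise<void> = Promise.resolve();
    for (let i = 0; i < times; i++) {
        coalescer.request("paused");            // app minimized
        settled = coalescer.request("running"); // app maximized again
    }
    await settled;
    return applied;
}
```

In this simulation, ten hide/show cycles issued back to back result in two applied transitions instead of twenty. Whether this helps in a real app depends on how fast the events actually arrive relative to the duration of a transition.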

Apr 23 '25 13:04 space7panda

@space7panda I could not reproduce your problem. Can you make a PR with a test case that simulates this behavior and shows that it causes errors?

May 10 '25 13:05 pubkey