replicache
replicache copied to clipboard
"The database connection is closing." error again (10.x specific?)
Le sigh. See: https://discord.com/channels/830183651022471199/966042424273150033
Tasks:
- [ ] When this error happens (and the Replicache instance hasn't been closed) try to reopen the connection to the idb database. If we find the database is still there resume normal operation. If we find the database has been deleted throw a specific Error indicating this. This change will: a) automatically recover when possible and b) enable discerning between connections randomly closing (which Google search results suggests does happen) and connections closing due to idb database being deleted.
- [ ] If (1) proves useful, add an explicit error callback to Replicache for notifying the developer when we detect the idb database has been deleted (similar to https://doc.replicache.dev/api/classes/Replicache#onclientstatenotfound).
- [ ] Ensure that all api entry points query/mutator/etc check if Replicache is already closed and throw a specific Error in this case.
- [ ] Ensure that all background processes check if Replicache is already closed before trying to access idb (we generally do this but our probably missing a case or two).
The overall dataset is small, so this might change, but so far this error has only occurred on Chrome 100.
I'm intermittently see this in tests too:
▶ npm run test
> [email protected] test
> web-test-runner
src/replicache-mutation-recovery.test.ts:
❌ successfully recovering mutations of client with same schema version and replicache format version (failed on Webkit)
InvalidStateError: Failed to execute 'transaction' on 'IDBDatabase': The database connection is closing.
at readImpl (src/kv/idb-store.ts:158:28)
at src/kv/idb-store.ts:35:26
Firefox: |██████████████████████████████| 50/50 test files | 338 passed, 0 failed, 8 skipped
Chromium: |██████████████████████████████| 51/51 test files | 338 passed, 0 failed, 9 skipped
Webkit: |██████████████████████████████| 50/50 test files | 337 passed, 1 failed, 8 skipped
Finished running tests in 17.2s with 1 failed tests.
@grgbkr
One way to make these go away is to prevent concurrency in the test runner -- but we do want to support concurrency so we need to better deal with exceptions during closing of Replicache. One possibility would be to pass in the AbortSignal through many more layers...
I'm getting this error frequently in local dev. It might be related to hot-reloading. I still haven't deployed / tested a prod version, but it's a bit annoying in dev.
My Chrome version is 102.0.5005.115 (Official Build) (arm64).
Plan to investigate this more today.
@koahmad Can you share the details of your local dev setup?
Yes -- I'm using NextJS (10.2.3) and use next dev
to start the app.
What else would be helpful to you?
@koahmad thanks, that is enough to help me get started on reproing.
More information on this. Another of our customers reports they see this error in their error reporting from prod.
The browser breakdown is
Errors from one session:
This can happen in production when
- the idb is deleted a. via code b. via devtools c. or by the browser due to hitting size quota
- close is called on the replicache instance, and then apis like replicache.query, or replicache.mutate.* are used.
- the browser closes the idb connection for some other reason (it shouldn't but google results suggests it randomly does).
For 1 we can't recover, but we could detect and notify the developer via an error callback (similar to https://doc.replicache.dev/api/classes/Replicache#onclientstatenotfound).
For 2 we should check if already closed in query/mutate/etc, and throw a specific error.
For 3 we should try to recover by re-openning the idb conneciton.
Perhaps we could collapse (1) and (3) by trying to reopen and if we find that the db is gone (😬) notifying user.
Wanted to share an update on our current understanding of the impact of this bug when it occurs in production. If the client view is < 100MBs in size the impact will be largely invisible to end users.
When the client view is < 100MB the impact for the affected tab is:
- pending mutations are not persisted locally, and so if the tab is closed before pending mutations are pushed they will be lost (as mutation recovery can not recover non-persisted mutations)
- newly synced snapshots are not persisted, resulting in new tabs having to sync a bit more on start up
Other than the above the client will function normally. Mutators, querys, subscribe and watches all continue to work. Sync will continue to work. This is because all of these work against a memory cache that is lazyily persisted to IndexedDB. If the connection to IndexedDB is lost they all continue to work correctly against the memory cache and things just fail to persist.
If the client view is > 100MBs, there is more likely to be end user visible impact. Currently Replicache's memory cache has a size limit of 100MB (using a LRU eviction strategy). If the IndexedDB connection is lost and there is a memory cache miss during execution of a Replicache operation (mutator, query, subscribe, watch, push or pull), then that operation will fail.
Ok I think we now have a solid plan for addressing this:
- When this error happens (and the Replicache instance hasn't been closed) try to reopen the connection to the idb database. If we find the database is still there resume normal operation. If we find the database has been deleted throw a specific Error indicating this. This change will: a) automatically recover when possible and b) enable discerning between connections randomly closing (which Google search results suggests does happen) and connections closing due to idb database being deleted.
- If (1) proves useful, add an explicit error callback to Replicache for notifying the developer when we detect the idb database has been deleted (similar to https://doc.replicache.dev/api/classes/Replicache#onclientstatenotfound).
- Ensure that all api entry points query/mutator/etc check if Replicache is already closed and throw a specific Error in this case.
- Ensure that all background processes check if Replicache is already closed before trying to access idb (we generally do this but are probably missing a case or two).
We will do (1) this week, and (2) shortly after.
We should hold off on (3) and (4) until after https://github.com/rocicorp/replicache/issues/1009. The full offline support change involves a lot of significant refactors, and I worry if we do this before offline there is a non-trivial chance of those refactors regressing it. Also (3) and (4) won't improve anything for end users, they just provide nicer error reporting for developers.
I think eventually there should also be a (5) which is a bigger project:
- some sort of first-class feature around managing quota and eviction.
I think eventually there should also be a (5) which is a bigger project:
- some sort of first-class feature around managing quota and eviction.
Agree, though I'd track that feature as a separate issue.
Adding some relevant bug reports about unexpected idb connection closes from other projects: https://bugs.webkit.org/show_bug.cgi?id=197050 https://bugs.chromium.org/p/chromium/issues/detail?id=1085724 https://github.com/jensarps/IDBWrapper/issues/80 https://github.com/firebase/firebase-js-sdk/issues/1533
Update: There's a new beta release of Replicache: 11.2.1
that contains code to help us understand what's happening here. If you are seeing this error, please install:
npm add [email protected]
Then deploy to production. You should see one of two things happen:
- The error goes away. In that case, the rumors are true: Chrome will sometimes mysteriously close connections to IDB and just reopening it fixes it 🤷♂️.
- The error changes to
Replicache IndexedDB not found: <name>
. In that case, the databases are getting deleted -- perhaps due to quota or some other reason.
Please let us know what happens. Thanks!
So, it went away, it seemed, but it's back - still seeing these issues cropping up in development.
What version of Replicache?
At currently:
=> Found "[email protected]"
I'm still pretty interested in a version of Replicache that doesn't use IDB. I really think the costs outweigh the benefits of the local-storage approach for my usecase, and the browser implementations of IDB are apparently really flawed.
@grgbkr said he had some ideas why there is still a possible case of this.
Some good new on this. We found a way to easily reproduce this by using replicache in a codesandbox. The super aggressive HMR makes it easy to reproduce. Now that we can repro, we can more reliably debug and fix.