FR: Offline first support for PWAs (RTDB)

Open Hollerweger opened this issue 7 years ago • 100 comments

While the Firebase JS SDK supports the case where a web app goes from online to offline, it lacks offline-first support. Offline-first is a crucial part of PWAs and should be supported by the Firebase JS SDK directly.

Hollerweger avatar May 18 '17 19:05 Hollerweger

Hey there! I couldn't figure out what this issue is about, so I've labeled it for a human to triage. Hang tight.

google-oss-bot avatar May 18 '17 19:05 google-oss-bot

I have some ideas of what I think you mean by "offline first support," but the term itself is rather vague, especially when viewed in the context of the entire JS SDK, as each part of Firebase (auth, database, storage, messaging) would have a different approach to "offline first support."

That said, I like what this issue calls attention to, and it is something that I'd love to pursue. Can you help me understand what specific things were difficult for you in doing offline first development?

jshcrowthe avatar May 18 '17 21:05 jshcrowthe

I was thinking of database persistence similar to what the Firebase iOS and Android SDKs support, available even when the web app is reopened offline in a new browser tab. Right now I need to implement my own offline persistence layer on top of Firebase to support offline-first scenarios. There was even a Firebase I/O session last year about PWAs and offline first. For that demo, Polymer with an IndexedDB mirror was used on top because the functionality was not provided by the Firebase JS SDK itself: https://www.youtube.com/watch?v=SobXoh4rb58 With this approach I'm limited to Polymer's offline functionality, without the ability to query the Firebase database directly. It would be great if such an IndexedDB mirror could be part of the Firebase JS SDK itself.

Hollerweger avatar May 18 '17 22:05 Hollerweger

This is something I have talked to Firebase support about in the past and I was actually just about to open my own issue until I saw this one. When I think about Firebase (at least, my usage of Firebase) and offline functionality I think of storage.

I think what would make the most sense would be to refactor the current implementation to support storage adapters. The current implementation could become the default, "in-memory" adapter. Other community-developed or officially supported adapters could be published as well. IndexedDB is an obvious choice, it's what PouchDB uses by default. A less obvious adapter I would like to implement for use in Electron would be a sqlite adapter.

Just spitballing here, but there could also be a proxy adapter to use two adapters together. For instance, I could use the in-memory adapter along with my sqlite adapter for performance purposes.
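A minimal sketch of what the adapter and proxy-adapter idea could look like. The `MemoryAdapter` and `ProxyAdapter` names and the `get`/`set`/`remove` contract are assumptions made up for illustration; nothing like this exists in the Firebase JS SDK.

```javascript
// Hypothetical adapter contract: async get/set/remove keyed by a string path.
// A real IndexedDB or sqlite adapter would expose the same three methods.
class MemoryAdapter {
  constructor() {
    this.store = new Map();
  }
  async get(key) {
    return this.store.has(key) ? this.store.get(key) : null;
  }
  async set(key, value) {
    this.store.set(key, value);
  }
  async remove(key) {
    this.store.delete(key);
  }
}

// Hypothetical proxy adapter: reads try the fast adapter first and fall back
// to the durable one; writes and removals go to both.
class ProxyAdapter {
  constructor(fast, durable) {
    this.fast = fast;
    this.durable = durable;
  }
  async get(key) {
    const hit = await this.fast.get(key);
    if (hit !== null) return hit;
    const value = await this.durable.get(key);
    if (value !== null) await this.fast.set(key, value); // warm the fast layer
    return value;
  }
  async set(key, value) {
    await Promise.all([this.fast.set(key, value), this.durable.set(key, value)]);
  }
  async remove(key) {
    await Promise.all([this.fast.remove(key), this.durable.remove(key)]);
  }
}
```

With this shape, `new ProxyAdapter(new MemoryAdapter(), someSqliteAdapter)` would give the in-memory plus sqlite combination described above, where `someSqliteAdapter` is any object implementing the same three methods.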

The Firebase SDKs were only just recently open sourced. Would these types of features be welcome for pull requests?

knpwrs avatar May 18 '17 23:05 knpwrs

I was looking at achieving this type of functionality with Firebase and Redux-Offline. If the Firebase JS SDK were made modular with defaults, it seems it should do so in other layers of the SDK than just storage in order to meet all the offline-first criteria specified in Redux-Offline, for example by exposing functions for implementing custom reconciliation of optimistic update failures/rollbacks.

jthegedus avatar May 19 '17 05:05 jthegedus

Making data available offline-first using react-redux and Firebase is no big deal. Here is a working example: https://github.com/TarikHuber/react-most-wanted But the data is available offline only to the client. The Firebase database listeners don't know that you already have most of the data in local storage. It would be great if Firebase itself managed offline-first. Maybe it could then load only the data that is missing from local storage, rather than all of it as happens today in a running app with a connection. For example: you load 10 tasks in your application and then go offline or close the application. After you reconnect, Firebase uses its own cache not only to give you the 10 tasks already loaded but also to load just the 2 tasks that were added afterwards, plus any edits to the existing 10.

TarikHuber avatar May 19 '17 08:05 TarikHuber

It's no big deal, except you have to manage Redux in addition to Firebase. You have no control over when Firebase syncs to the server, you're restricted by Firebase's local cache limits, and persisting the cache isn't trivial. And Redux certainly overlaps with what the Firebase SDK does for offline. All of this could be mitigated if Firebase supported a few modules/adapters for offline-first, similar to what Redux-Offline defines. I'm just suggesting that we use Redux-Offline as a guide for which parts could be made modular.

jthegedus avatar May 19 '17 08:05 jthegedus

@knpwrs I think this is something that we could totally accept as a PR! Love to have your contributions. The notion of different storage adapters is also an interesting idea that I'd love to see more details on.

In addition, I'd encourage everyone, for all Feature Requests, to make sure you are signed up for the Firebase Alpha Program where you can keep up on all the upcoming features and products.

jshcrowthe avatar May 24 '17 17:05 jshcrowthe

@jshcrowthe Exactly which features and products does the Firebase Alpha Program offer at this time? I completed and submitted the form 3 days ago. How long does admission into the Alpha Program usually take?

FluorescentHallucinogen avatar May 25 '17 13:05 FluorescentHallucinogen

We don't disclose what's in the alpha program (that's kinda the point) and I'm not sure how fast processing applications is. Either way remember that the alpha program is for alpha software which will not be recommended to ship in production.


mbleigh avatar May 29 '17 17:05 mbleigh

+1 to what @mbleigh said.

Let's try and keep this thread on topic though 😄 . Further questions on the Alpha program would be better directed to our support or discussion channels (link here: https://firebase.google.com/support/)

jshcrowthe avatar May 30 '17 16:05 jshcrowthe

Adding some form of local storage in our apps to cache the data retrieved from the Firebase database (like most of us are doing, I guess) works quite well to enable offline use even when cold-starting an app, but the main problem I see with not having persistence built into the SDK like it is in the Android or iOS ones is this: when an app starts, the SDK has no idea what data is stored locally so, when it attaches listeners, the hash field is empty and the server responds with all the data. Every single time. That means that data usage with the JS SDK is significantly higher than with the native ones.

I understand building a reliable and truly cross-browser local storage solution into the Firebase SDK is no easy feat, maybe even impossible, but it doesn't need to be perfect. It just needs to be better than not having it, even if only in some situations. It could be implemented gradually, first for whatever browsers have the best IndexedDB or SQLite support and then slowly with others, if possible.

Another possible solution, albeit a radically different approach to what is being currently used in the other platforms, would be for the SDK user to pass whatever data it has for a certain database location before attaching a listener.

This might be better explained with an example: let's say we already know what the data at /messages contains, because we had it locally stored somehow:

let data = {
  "-Kgx5lyGUg7w9nnNAKss": {
    "from": "Bob",
    "text": "Hey there"
  },
  "-Kgx9en1kPcRyy1uk7j7": {
    "from": "Alice",
    "text": "Sup?"
  }
};

So there would be a way to "bootstrap" the data at that location before attaching a listener, letting the SDK know what we know:

firebase.database().ref('messages').bootstrap(data).on('child_added', snap => { /* */ });

This leaves some open questions, though: should the SDK always accept that data, or should it ignore it when it is positive the data it has is fresh? (maybe because there's already an active listener on that path).

This would only be a temporary solution anyway, since it puts most of the burden on the developer using the library (figuring out how to store the data locally, passing it to the SDK, etc.) Not my favorite approach but it would certainly be a step up.
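To make the idea concrete, here is a rough userland sketch of the same behavior. `bootstrapChildAdded` and its `subscribe` parameter are hypothetical: `subscribe` stands in for something like `ref.on('child_added', ...)`. Note that this only deduplicates events on the client. It does not reduce what the server sends, which would require the SDK itself to know about the cached data.

```javascript
// Hypothetical helper, not part of the Firebase SDK.
// cached:    object of key -> value already stored locally
// subscribe: function taking a (key, value) callback for server events
// onChild:   function called as onChild(key, value, source)
function bootstrapChildAdded(cached, subscribe, onChild) {
  const seen = new Map();

  // 1. Replay locally known children right away.
  for (const [key, value] of Object.entries(cached)) {
    seen.set(key, JSON.stringify(value));
    onChild(key, value, 'cache');
  }

  // 2. Forward server events, skipping children already emitted unchanged.
  //    JSON.stringify is a crude equality check that depends on key order.
  return subscribe((key, value) => {
    const serialized = JSON.stringify(value);
    if (seen.get(key) === serialized) return;
    seen.set(key, serialized);
    onChild(key, value, 'server');
  });
}
```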

jsayol avatar Jun 03 '17 20:06 jsayol

Some more thoughts: a possible solution to add local storage support would be to use localForage, maybe wrapping it like ionic-storage is doing. This would allow us to use whatever the best solution is in every scenario/browser.

@knpwrs's idea of storage adapters also seems quite interesting. ~~Skimming through the code, it seems like there's already support for in-memory, LocalStorage, and SessionStorage. So it seems it would be a matter of implementing a way to allow the user to provide their own adapter with a compatible API. I would also consider this a temporary solution though, like the bootstrapping I mentioned in my previous comment.~~ [EDIT] I was looking in the wrong place, nevermind what I said here[/EDIT] The ultimate goal should be for the SDK to handle all this for the user, the same way it's done in the native SDKs.

jsayol avatar Jun 04 '17 07:06 jsayol

Having a storage mechanism like redux-persist would allow for complete browser/native coverage. Then the user would only have to specify an environment flag for the correct storage adapter to be used.

jthegedus avatar Jun 04 '17 09:06 jthegedus

@jsayol I've been looking at how PouchDB stores data offline. They've taken the approach of basing their storage around LevelUP. From that point you can plug in different backends such as MemDOWN (in-memory), level.js (IndexedDB), or even something like SQLdown (sqlite3, PostgreSQL, and MySQL). There's even an Abstract LevelDOWN project which can be used to implement compatible backends. Basing storage around LevelUP could be potentially interesting because then we inherit a large offering of various storage backends. By default we could use MemDOWN and offer the ability to use different backends such as level.js.

knpwrs avatar Jun 04 '17 16:06 knpwrs

I believe the Firebase developers already have a solution. That's why @jshcrowthe suggests signing up for the Firebase Alpha Program.

@jshcrowthe, @mbleigh, is that right?

FluorescentHallucinogen avatar Jun 04 '17 18:06 FluorescentHallucinogen

@FluorescentHallucinogen, the Alpha Program is a great way to work with developers who are willing to donate their time to help make Firebase an awesome platform. There is really good discussion going on, so the invitation is to make sure we get to work with all of you in that space as well!

jshcrowthe avatar Jun 06 '17 21:06 jshcrowthe

AFAICT this discussion primarily emphasizes two things

  1. Providing offline access to data
  2. Leveraging offline data on boot to prevent unneeded network traffic

The notion of pluggable storage adapters is something that I think is a cool idea and I'd love to see a demo implementation of this in context of the SDK. This could be separate from minimizing the network traffic as the amount of network traffic would be no different than what it is today. Once we had an agreed upon implementation of persistence, reducing network overhead is just the next logical step.

In the iOS SDK (Github Repo: https://github.com/firebase/firebase-ios-sdk) we are synchronizing only the delta between the local device and server state. In principle we could port that same functionality over to web, and then integrate it with the persistence layer discussed above.

@jsayol / @knpwrs I'd love to see a sample implementation of the storage adapters concept, sounds like a solid strategy to allow for flexible browser/environment requirements.

jshcrowthe avatar Jun 07 '17 22:06 jshcrowthe

> In the iOS SDK (Github Repo: https://github.com/firebase/firebase-ios-sdk) we are synchronizing only the delta between the local device and server state.

That's very interesting. You mean that if the hashes don't match when attaching a listener, only the difference is synchronized? If so that's pretty cool, and quite different from the web SDK where the whole thing is resent in that situation.

How is it implemented? Do you traverse the tree checking the hashes at every node to figure out what's up to date and what isn't? I'm trying to locate the relevant code in the iOS repo but I can't seem to find it (and not being familiar with ObjC doesn't help either :smile:).

jsayol avatar Jun 08 '17 07:06 jsayol

@jshcrowthe Time permitting I may be able to get something done. What do you think about utilizing LevelUP as suggested in my previous comment? Obviously assumes a compatible data model. If it's not compatible then we'd need to design our own adapters.

knpwrs avatar Jun 08 '17 11:06 knpwrs

@knpwrs I looked at LevelUP and it seems like a really solid library, however I don't know that we need all that it provides. With the database already being quite large, adding another large persistence library is probably a hard sell (I just ran LevelUP through a quick webpack build, 103kb min).

Same story goes for something like LocalForage (although this one is admittedly lighter coming in at around ~25kb).

IMO I'd start w/ just the raw primitives until we need the abstraction (we are going to have to build our own abstraction layer already to allow it to be pluggable).

jshcrowthe avatar Jun 08 '17 17:06 jshcrowthe

@jsayol so we currently are using a hash function that can be found here:

https://github.com/firebase/firebase-js-sdk/blob/master/src/database/js-client/core/Repo.js#L204-L236

This hash is a "simple" hash of the data in the node. We then send that hash to the server when we call listen in the PersistentConnection (see https://github.com/firebase/firebase-js-sdk/blob/master/src/database/js-client/core/PersistentConnection.js#L183)

By leveraging "compound" hashing (which is a hash of key ranges instead of the entire node, iOS implementation found here: https://github.com/firebase/firebase-ios-sdk/blob/master/Firebase/Database/Core/FCompoundHash.m) we could minimize traffic over the wire. We would just need to implement the ability to merge the range updates that we receive with what we already have in memory. (see https://github.com/firebase/firebase-ios-sdk/blob/master/Firebase/Database/Core/FRangeMerge.m)
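As a rough illustration of why hashing key ranges saves traffic, here is a simplified sketch. It is not the algorithm in `FCompoundHash.m` or `FRangeMerge.m`: the fixed-size ranges, the `hashOf`, `rangeHashes` and `staleRanges` names, and the SHA-1 over `JSON.stringify` are all assumptions chosen to keep the example short.

```javascript
const crypto = require('crypto');

// Illustrative hash only; the SDK uses its own serialization of the node.
function hashOf(value) {
  return crypto.createHash('sha1').update(JSON.stringify(value)).digest('base64');
}

// Client side: split a node's children (sorted by key) into fixed-size ranges
// and hash each one. `end` is the inclusive upper key of the range, or null
// for the last range, which is unbounded so that new trailing keys are caught.
function rangeHashes(node, rangeSize) {
  const keys = Object.keys(node).sort();
  const ranges = [];
  for (let i = 0; i < keys.length || i === 0; i += rangeSize) {
    const slice = keys.slice(i, i + rangeSize);
    const isLast = i + rangeSize >= keys.length;
    ranges.push({
      end: isLast ? null : slice[slice.length - 1],
      hash: hashOf(slice.map((k) => [k, node[k]])),
    });
  }
  return ranges;
}

// Server side: return only the ranges whose hash differs from the client's.
function staleRanges(clientRanges, serverNode) {
  const serverKeys = Object.keys(serverNode).sort();
  const updates = [];
  let start = null; // exclusive lower bound of the current range
  for (const { end, hash } of clientRanges) {
    const keys = serverKeys.filter(
      (k) => (start === null || k > start) && (end === null || k <= end)
    );
    const children = keys.map((k) => [k, serverNode[k]]);
    if (hashOf(children) !== hash) updates.push({ start, end, children });
    start = end;
  }
  return updates;
}
```

With a single hash for the whole node, any change forces the full node to be resent. With range hashes, only the ranges that actually changed come back, and the client merges them into what it already holds.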

All that said, I think the right first step is to allow for persistent offline through IndexedDB (or an adapter structure), and then work towards this.

jshcrowthe avatar Jun 08 '17 20:06 jshcrowthe

Thanks for the links @jshcrowthe!

I knew about the hash function (a few months ago it took me a while of digging through minified code to figure that one out :P) but I had no idea about the whole compound hashing implementation. I'll definitely look into it!

I agree with you though, none of it will be very useful without persistence so let's focus on that first. I think a solid first approach would be to simply use IndexedDB, since that would cover most use-cases. (Safari's implementation of IndexedDB is known to have issues though, so it might be worth looking into WebSQL too. Maybe. I don't know.) If we also want to support Node.js then we'd have to look into other options too, but having direct access to the file system opens a whole lot of other possibilities there.

You raised a valid point in a previous comment about bundle size. Ideally we'd keep this change as small as possible but if it ends up getting too large for comfort it could just be implemented into its own sub-module, as an optional feature to be added by the user if they want to use persistence. Something like this:

const firebase = require('firebase/app');
require('firebase/database');
require('firebase/db-persistence');

I'll start looking into how IndexedDB could fit into the current implementation. Off the top of my head, we'd have to build a system to consistently synchronize the contents of the MemoryStorage with what's being persisted, and probably ensure we're not hitting persistence too often during read or write bursts to avoid performance issues. This synchronization could happen after a certain time of inactivity on the database, for example 10 seconds, with a maximum interval between flushes to minimize the risk of ending up with stale data if the app crashes or is suddenly shut down.
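The timing policy described here (flush after a quiet period, but never wait longer than some maximum) could be sketched roughly as follows. `FlushScheduler` and `nextFlushAt` are hypothetical names, the `flush` callback stands in for whatever copies in-memory state to IndexedDB, and the 60-second maximum is an assumed value; only the 10-second idle period comes from the comment above.

```javascript
// Pure helper: the next flush is due at whichever comes first, the end of the
// idle window after the latest write or the hard deadline after the first
// unflushed write.
function nextFlushAt(firstDirtyAt, lastWriteAt, idleMs, maxWaitMs) {
  return Math.min(lastWriteAt + idleMs, firstDirtyAt + maxWaitMs);
}

// Hypothetical scheduler; not part of the Firebase SDK.
class FlushScheduler {
  constructor(flush, idleMs = 10000, maxWaitMs = 60000) {
    this.flush = flush;
    this.idleMs = idleMs;
    this.maxWaitMs = maxWaitMs;
    this.firstDirtyAt = null; // time of the first write since the last flush
    this.timer = null;
  }

  // Call whenever the in-memory state changes.
  markDirty(now = Date.now()) {
    if (this.firstDirtyAt === null) this.firstDirtyAt = now;
    const due = nextFlushAt(this.firstDirtyAt, now, this.idleMs, this.maxWaitMs);
    clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flushNow(), Math.max(0, due - now));
  }

  // Persist immediately and reset the bookkeeping.
  flushNow() {
    clearTimeout(this.timer);
    this.timer = null;
    this.firstDirtyAt = null;
    this.flush();
  }
}
```

During a sustained burst of writes the idle timer keeps getting pushed back, so the `maxWaitMs` deadline is what bounds how stale the persisted copy can get.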

Thoughts?

P.S.: I still think the storage adapter idea is an interesting one that can be added later, but basic browser persistence should be provided by the SDK out of the box anyway.

jsayol avatar Jun 08 '17 21:06 jsayol

To me the best case scenario for true offline support would be if it was completely transparent. Downloaded data would be available from persistent storage, and new data would be written to persistent storage and synched automatically once the device is online again.

I'd like to add (since I haven't seen this mentioned) that localStorage on mobile is not very persistent, since it can be deleted at any time by the OS. This would be much more inconvenient in Cordova, React Native, and NativeScript apps. Users expect more persistence from an app than from a website.

I agree that the right approach would be to have an API and write adapters as separate modules (official or third party) to avoid bloating the main SDK.

PierBover avatar Jun 10 '17 03:06 PierBover

Side thought: how do the iOS and Android SDKs (and this SDK, for that matter) handle transactions when offline?

knpwrs avatar Jun 10 '17 18:06 knpwrs

@knpwrs That is a great question! Paging @schmidt-sebastian since I don't know off the top of my head.

jshcrowthe avatar Jun 14 '17 17:06 jshcrowthe

~~As far as I know, the iOS and Android SDKs will keep track and try to complete the transaction even across app restarts. The JavaScript SDK will only keep the transaction alive during the same session, since it doesn't persist the transaction state. This is definitely something that can be improved with this whole persistence "overhaul".~~

~~Other than that, they work the same way: when offline, the transaction callback will receive either the latest known value or null if it isn't known, will trigger optimistic updates unless instructed not to do so, and won't trigger the completion callback until the transaction is actually committed by going back online.~~

Edit: Oops, turns out I was wrong about transactions. See Frank's comment below for an explanation.

jsayol avatar Jun 14 '17 18:06 jsayol

Transactions are explicitly not persisted to disk. They do not survive app restarts.

This was an explicit decision by the team at the time. It might be good to revisit the discussion at some point. But for a first iteration, I'd recommend aiming for feature parity with iOS and Android and not persisting transactions.

puf avatar Jun 15 '17 14:06 puf

Ok, I've started looking into this. Some general considerations:

  • Persistence should be off by default and the user can enable it at will. Same behavior as in iOS & Android, AFAIK.
  • Both server and user-initiated writes should be persisted.
  • For user writes:
    • All set() and update() operations get persisted.
    • transaction() operations are only persisted once acknowledged by the server (committed). [@puf, when you said that transactions are not persisted in iOS/Android I assume you meant the intermediate states, right?]
    • Keep track of pending writes in case they fail, so that we can roll back persisted data.

Thoughts so far?

I'll start working on a simple proof of concept implementation to get the ball rolling. If I'm not mistaken the most obvious first "point of attack" seems to be the SyncTree so I'll focus on that. Feel free to offer any comments and suggestions, though :)

P.S.: I'll try to be as independent as possible while working on this to avoid bothering you all too much, but I might ask for some guidance from time to time. Hope that's ok!

jsayol avatar Jun 22 '17 17:06 jsayol

As far as I know the iOS and Android clients keep two types of data in their disk cache:

  1. Data that was recently listened to.
  2. A queue of all pending write operations, excluding transactions.

Note that (again: as far as I know) the pending writes are not aggregated into the data cache. That only happens when a listener updates it.
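A small model of that layout, assuming the description above is accurate. `OfflineCache` and its method names are hypothetical, and the sketch only matches exact paths instead of handling a real tree of nodes.

```javascript
// Hypothetical model of the two persisted structures; not SDK code.
class OfflineCache {
  constructor() {
    this.serverCache = new Map(); // path -> last value received via a listener
    this.pendingWrites = [];      // ordered queue of { id, path, value }
    this.nextId = 1;
  }

  // Listener update from the server: the only thing that changes serverCache.
  applyServerUpdate(path, value) {
    this.serverCache.set(path, value);
  }

  // User write: queued, never merged into serverCache.
  enqueueWrite(path, value) {
    const id = this.nextId++;
    this.pendingWrites.push({ id, path, value });
    return id;
  }

  // The server acknowledged or rejected the write: either way it leaves the queue.
  settleWrite(id) {
    this.pendingWrites = this.pendingWrites.filter((w) => w.id !== id);
  }

  // What the app sees: the cached server value overlaid with pending writes, in order.
  read(path) {
    let value = this.serverCache.has(path) ? this.serverCache.get(path) : null;
    for (const w of this.pendingWrites) {
      if (w.path === path) value = w.value;
    }
    return value;
  }
}
```

One consequence of keeping the queue separate is that rolling back a rejected write needs no undo logic: dropping it from the queue is enough, because the cached server data was never modified.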

But I'd love @schmidt-sebastian or @mikelehen to give their take on this.

puf avatar Jun 23 '17 14:06 puf