
Cashay v2

Open mattkrick opened this issue 7 years ago • 18 comments

#149 gets us halfway to a new mutations API, so let's dive into how that might change. I like Relay 2's direction of imperatively mutating the normalized store. I also like how they scrapped intersecting fat queries with tracked queries.

  • When I write a mutation, I want to explicitly write everything I want back from the server. Cashay does something magical by writing the intersection itself, but sometimes a mutation comes before a query. Cashay handles this by using @cached on a query so it'll return nothing until the mutation comes back, but sometimes it's nice to get state from the result of cashay.mutate and not have to get it from the redux-store. Doing this will shrink cashay by roughly 40%.
  • Optimistic updates: each mutation should take a function describing how to adjust the normalized state optimistically. Take the getTop5Posts example: there are 4 ways that query can change:
    • The array of documents changes (A,B,C,D,E becomes A,B,C,D,F) optimistically
    • A document within the array gets updated (A.votes = A.votes +1) optimistically
    • The array of documents changes from the server
    • A document within the array gets updated from the server

1 and 3 should be discrete functions that are linked to the query operation (eg getTop5Posts) and passed into cashay to store.
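To make those four change modes concrete, here is a toy normalized store. The shapes are illustrative only, not the actual cashay internals:

```javascript
// Toy normalized store: entities keyed by type/id, query results stored
// as arrays of ids. All names here are made up for illustration.
const store = {
  entities: {Post: {A: {id: 'A', votes: 3}, F: {id: 'F', votes: 9}}},
  results: {getTop5Posts: ['A', 'B', 'C', 'D', 'E']}
};

// Modes 1 & 3: the array of documents changes (optimistically or from server)
store.results.getTop5Posts = ['A', 'B', 'C', 'D', 'F'];

// Modes 2 & 4: a document within the array gets updated
store.entities.Post.A.votes += 1;

console.log(store.results.getTop5Posts.includes('F')); // true
console.log(store.entities.Post.A.votes); // 4
```

Under this shape, modes 1 and 3 only touch the result array while 2 and 4 only touch an entity, which is what makes it possible to link discrete update functions to the query operation.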

//TODO

mattkrick avatar Dec 18 '16 18:12 mattkrick

So, let's say each mutation has 3 user-defined methods:

const upvoteMutation = {
  optimistic: (store, optimisticVars) => {
    return store.entity('Post').get(optimisticVars.id).update((doc) => ({
      updatedAt: new Date(),
      votes: doc('votes') + 1
    }));
  },
  resolve: (store, response) => {
    return store.entity('Post').get(response.id).update({
      votes: response.votes
    });
  },
  updateResults: (store, doc) => {
    const currentResult = store.result('getTop5Posts').get({area});
    let minIdx;
    let minVal = Infinity;
    currentResult.forEach((res, idx) => {
      if (res('votes') < minVal) {
        minVal = res('votes');
        minIdx = idx;
      }
    });
    if (minVal < doc('votes') && minIdx !== undefined) {
      currentResult.add(doc);
      currentResult.remove(minIdx);
    }
  }
};

optimistic is called immediately, as soon as the optimisticVars are sent to the server. resolve is called with the response from the server. updateResults is called after each of them. The first 2 do nothing but update a single entity. updateResults targets each query that the mutation could affect. It finds out if there is a post with fewer votes than the one that just got updated, and if so, it replaces it. The dependencies are checked and if nothing else depends on that document, it is removed from the state. The order doesn't matter because changing the array invalidates the query, which causes the query-specific sort/filter functions to be re-run.
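That call order can be simulated with a toy store and a fake transport. Everything below is a sketch under those assumptions, not the cashay implementation:

```javascript
// Toy store: entities in a plain object, plus a stub updateResults.
const store = {posts: {post123: {id: 'post123', votes: 4}}};

const upvoteMutation = {
  optimistic: (store, vars) => {
    const doc = store.posts[vars.id];
    doc.votes += 1; // guess the server outcome right away
    return doc;
  },
  resolve: (store, response) => {
    const doc = store.posts[response.id];
    doc.votes = response.votes; // overwrite the guess with server truth
    return doc;
  },
  updateResults: (store, doc) => {
    // a real implementation would splice the affected result arrays here
  }
};

// 1. optimistic runs as the request goes out; 2. resolve runs on response;
// updateResults runs after each of them.
const runMutation = async (mutation, store, vars, sendToServer) => {
  const optimisticDoc = mutation.optimistic(store, vars);
  mutation.updateResults(store, optimisticDoc);
  const response = await sendToServer(vars);
  const serverDoc = mutation.resolve(store, response);
  mutation.updateResults(store, serverDoc);
};

const sendToServer = async ({id}) => ({id, votes: 5}); // canned server reply

runMutation(upvoteMutation, store, {id: 'post123'}, sendToServer)
  .then(() => console.log(store.posts.post123.votes)); // 5
```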

For example, let's say client A is listening to getTop5Posts/California and client B just upvoted post126. That upvote bumped it into the top 5! The message queue is going to kick down a message to client A that looks kinda like this:

{
  resultChanges: [
    {
      channel: 'getTop5Posts/california',
      removedDoc: 'post123',
      addedDoc: 'post126'
    }
  ],
  docsToUpsert: [
    {
      id: 'post126',
      upvotes: 28
    }
  ]
}

cashay will internally call some code like:

const [result, area] = channel.split('/');
const args = {area};
const doc = store.entity('Post').upsert({
  id: 'post126',
  upvotes: 28
});
const currentResult = store.result(result).get(args);
const idxToRemove = currentResult.findIndex((res) => res('id') === 'post123');
currentResult.add(doc);
currentResult.remove(idxToRemove);
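The "remove it only if nothing else depends on it" check mentioned above could be a simple refcount per entity. This is a hypothetical sketch, not how cashay actually tracks dependencies:

```javascript
// Each cached result that references a doc bumps its refcount; when the
// count hits zero the entity can be garbage-collected from the store.
const store = {
  entities: {Post: {post123: {id: 'post123', votes: 2}}},
  refCounts: {post123: 1} // only getTop5Posts/california references it
};

const dropRef = (store, docId) => {
  if (--store.refCounts[docId] === 0) {
    delete store.entities.Post[docId]; // no query needs it anymore
  }
};

dropRef(store, 'post123');
console.log('post123' in store.entities.Post); // false
```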

mattkrick avatar Dec 18 '16 20:12 mattkrick

These changes look really great—if I'm reading correctly, this will greatly simplify (eliminate?) mutationHandlers.

dustinfarris avatar Dec 19 '16 00:12 dustinfarris

@dustinfarris yup. This will largely reduce the scale & complexity of the whole mutations (and eventually subscriptions) API. What's still unclear is how to better handle pagination. I still like hiding the cursor from the front-end developer & just having cashay grab it behind the scenes; I'll need to give it more thought.

mattkrick avatar Dec 19 '16 03:12 mattkrick

Sorry, it's late on a Friday and this is a bit of a meandering post.

I'm sizing up the competition of apollo-client because I find the DX (mostly through react-apollo) rather clunky. Lokka and Cashay both have interesting ideas for making a dev's life simpler, but the apollo ecosystem is farther along in featureset, from what I can tell.

From what I can tell, everybody is shying away from the clever query intersecting because it's much easier to just ask for the whole query with a pre-arranged id instead. Am I correct in understanding that the cashay webpack loader is not necessary then?

I like that you can just call a mutation by name with variables, which 100% covers my use case and makes code much nicer. Calling two mutations in one mutation is just asking for trouble.

BTW, with react-apollo the mutation is defined and called (including any promise follow-ups) in the components, which I find grating; I'd much rather see this done close to the Redux store actions and reducers.

In apollo, a mutation is called with extra configuration to provide an optimistic response, and there are 4 different ways to update the cache based on the mutation. You can tweak your own data, data of other active queries defined by the query name and the variables, or you can listen to all mutation responses on Redux and update your own state based on what other mutations got as results. You can also simply tell certain queries to refetch. None of those ways is always best.

Comments on your example:

  • When you optimistically add one to a vote, and accept the latest server response as correct, you will have problems when there are multiple votes in flight. Suppose you add 5 votes to a thing; the optimistic result will first show 5 added and then drop back to 2, 3, 4, 5 added as the results come in. Instead, you should keep track of the optimistic result for each mutation and only change the state if the server version differs from what was expected. I've been told Apollo does this.
  • In the resolve update, you miss updating the updatedAt timestamp
  • the top 5 posts example assumes single-vote updates, but the server can apply more than one vote at a time
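The per-mutation bookkeeping described in the first bullet could look something like this toy sketch (not Apollo's actual implementation): the rendered value is the confirmed server value plus every still-in-flight optimistic delta.

```javascript
let confirmedVotes = 0;
const inflight = new Map(); // mutationId -> optimistic delta

const render = () =>
  confirmedVotes + [...inflight.values()].reduce((a, b) => a + b, 0);

const sendUpvote = (mutationId) => inflight.set(mutationId, 1);
const onResponse = (mutationId, serverVotes) => {
  inflight.delete(mutationId);   // this mutation is no longer optimistic
  confirmedVotes = serverVotes;  // latest confirmed server value
};

for (let i = 1; i <= 5; i++) sendUpvote(i);
console.log(render()); // 5 (all optimistic)
onResponse(1, 1);
console.log(render()); // still 5: four deltas remain in flight, no drop-back
```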

Sorry for the rambling, just spouting thoughts :)

PS: Apollo does store data in denormalized form (too denormalized, I have a couple thousand keys from anonymous arrays, it slows down the devtools enormously), you might want to update your comparison table. You can also sort/filter responses via the props() function on a query, but that is per-component, the original data is what goes in the store.

wmertens avatar Feb 10 '17 22:02 wmertens

> but the apollo ecosystem is farther along in featureset

it freakin better be, $31MM of VC funding is depending on it 😂

that's very interesting about them denormalizing queries, i remembered when they were first getting started we had a little chat & that was one of the things they were adamantly against. I'm assuming they figured out that creating a whole bunch of new objects every single time a dispatch is called isn't the best use of CPU cycles...

> I have a couple thousand keys from anonymous arrays, it slows down the devtools enormously

I guess me & apollo still disagree on the definition of state. To me, state is something you can't calculate (at least not cheaply). That keeps redux devtools readable & means I can persist my entire state to localStorage without too much fanfare.

> Suppose you add 5 votes to a thing, then the optimistic result will first show 5 added and then drop back to 2, 3, 4, 5 added as the results come in. Instead, you should keep track of the optimistic result for each mutation and only change the state if the server version differs from what was expected. I've been told Apollo does this.

oh NOW we're getting into the fun stuff! Premise: post.value = 1. I click "vote" twice. So, state S0 = 1, S1 = 2, S2 = 3.

If I understand you correctly, you're saying when S1' == 2 (server response), then only invalidate if S1' !== S1 and reapply the chain of operations to S2. Now this is sound logic. Similar to what I did in redux-optimistic-ui. Honestly though, I probably won't build this in. I'd much prefer to let the user pass in an invalidate = (source, doc) => doc.votes !== source.votes. Then I don't have to store an array of cloned states, nor do I need to perform an operation on the state before I do my comparison. A reasonable default for the basics (since in-flight latency is rarely an issue), but then allow the dev to make something super powerful & performant by passing in an extra option.
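That escape-hatch option might look like the following. The names are hypothetical, sketching the idea of a cheap default with a per-mutation override:

```javascript
// Default: invalidate whenever the server doc differs from the optimistic one.
const defaultInvalidate = (expectedDoc, serverDoc) =>
  JSON.stringify(expectedDoc) !== JSON.stringify(serverDoc);

// Dev-supplied override: only compare the one field that matters.
const upvoteInvalidate = (expectedDoc, serverDoc) =>
  expectedDoc.votes !== serverDoc.votes;

console.log(upvoteInvalidate({votes: 2}, {votes: 2})); // false: keep optimistic state
console.log(upvoteInvalidate({votes: 2}, {votes: 7})); // true: re-sync from server
```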

mattkrick avatar Feb 11 '17 01:02 mattkrick


> but the apollo ecosystem is farther along in featureset
>
> it freakin better be, $31MM of VC funding is depending on it 😂
>
> that's very interesting about them denormalizing queries, i remembered when they were first getting started we had a little chat & that was one of the things they were adamantly against. I'm assuming they figured out that creating a whole bunch of new objects every single time a dispatch is called isn't the best use of CPU cycles...

Oh wow didn't know about the VC!

I have an app that shows a table with 20 rows, each having a cell that does a graphql query with an id, basically returning an array with maybe 40 small objects. Getting the data from cache takes 300ms :(

> I have a couple thousand keys from anonymous arrays, it slows down the devtools enormously
>
> I guess me & apollo still disagree on the definition of state. To me, state is something you can't calculate (at least not cheaply). That keeps redux devtools readable & means I can persist my entire state to localStorage without too much fanfare.

Hmmm, I think the problem is that they break out every object in the response and store it separately in a single huge object, using either the id if present or the array index as a key. The redux and apollo devtools fall over when looking at the store.

It seems to me that doing this is only useful for things you can reference by unique id, and therefore arrays of strings should not be denormalized. You can even read the schema beforehand to figure out which objects should be stored separately. (Handwave programming, obviously…)

> oh NOW we're getting into the fun stuff! Premise: post.value = 1. I click "vote" twice. So, state S0 = 1, S1 = 2, S2 = 3.
>
> If I understand you correctly, you're saying when S1' == 2 (server response), then only invalidate if S1' !== S1 and reapply the chain of operations to S2. Now this is sound logic. Similar to what I did in redux-optimistic-ui. Honestly though, I probably won't build this in. I'd much prefer to let the user pass in an invalidate = (source, doc) => doc.votes !== source.votes. Then I don't have to store an array of cloned states, nor do I need to perform an operation on the state before I do my comparison. A reasonable default for the basics (since in-flight latency is rarely an issue), but then allow the dev to make something super powerful & performant by passing in an extra option.

Hmmm, not entirely sure if I follow. To do optimistic responses, basically you do a mutation and provide the answer that you think will be returned by the server. That answer is used in anything requesting that particular object, and obviously anything using that answer has to be prepared for a sudden change of state.

Then, when the answer comes back, you can check if the answer is what you expected. If it's the same, no action is needed, and if it differs, you simply use it. Any pending optimistic answers for that/those objects need to be invalidated/treated as if they are not optimistic, reverting to standard staggering behavior.

No state clones needed?

wmertens avatar Feb 12 '17 22:02 wmertens

let's stick with the voting example, assume reddit rules where you can only upvote something once.

so, you click it twice. Votes go from 0 to 2. The first request gets sent via http & gets redirected to a slow server; the second request gets sent via http & goes to a fast server.

the response for req2 comes back first, successful, but the response for req1 comes back a failure. How do you achieve the proper state?

you need at least 1 state clone because the result of req2 may be different if req1 failed (and it is-- the state should be 1, not 2). So, you copy the state at the time of the first optimistic request, then you store a history of all operations. When a failed operation comes in, you go back to the first copied state & replay the actions, disregarding the failure. If it comes back a success, you mark the optimistic ones as successes & can advance the copied state to the point of the earliest in-flight operation.

code is easier to grok than me. check out https://github.com/mattkrick/redux-optimistic-ui. it's like 100 well-commented lines.
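A stripped-down sketch of that snapshot-and-replay idea (not the actual redux-optimistic-ui code): keep the state as of the earliest in-flight mutation plus a log of optimistic actions; on a failure, drop the failed action and replay the rest.

```javascript
const reducer = (state, action) =>
  action.type === 'UPVOTE' ? {votes: state.votes + 1} : state;

let snapshot = {votes: 0}; // S0: state before the earliest in-flight mutation
let log = [];              // optimistic actions dispatched since the snapshot

const dispatchOptimistic = (action) => log.push(action);
const currentState = () => log.reduce(reducer, snapshot);

dispatchOptimistic({type: 'UPVOTE', reqId: 1});
dispatchOptimistic({type: 'UPVOTE', reqId: 2});
console.log(currentState().votes); // 2 (both optimistic)

// req2 succeeds first, then req1 fails: discard it and replay the survivors
const onFailure = (reqId) => { log = log.filter((a) => a.reqId !== reqId); };
onFailure(1);
console.log(currentState().votes); // 1, the proper state
```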

mattkrick avatar Feb 12 '17 23:02 mattkrick


> let's stick with the voting example, assume reddit rules where you can only upvote something once.

You're assuming that the client code is not aware of server side rules? As a programmer, if you're not very sure that the optimistic result will be the correct one, you shouldn't do optimistic results to begin with…

> so, you click it twice. Votes go from 0 to 2. The first request gets sent via http & gets redirected to a slow server; the second request gets sent via http & goes to a fast server.
>
> the response for req2 comes back first, successful, but the response for req1 comes back a failure. How do you achieve the proper state?

Hmmm, here you're mixing actual server errors with denied server actions, or not?

Any response will return the actual vote count, and so whichever result has the latest timestamp should be the one the user sees. If there is a server error, I suppose the best thing to do is to re-fetch the original query, to see what the actual state is.

> you need at least 1 state clone because the result of req2 may be different if req1 failed (and it is-- the state should be 1, not 2). So, you copy the state at the time of the first optimistic request, then you store a history of all operations. When a failed operation comes in, you go back to the first copied state & replay the actions, disregarding the failure. If it comes back a success, you mark the optimistic ones as successes & can advance the copied state to the point of the earliest in-flight operation.

I think we are assuming different operating models. For me, graphql data is canonical. So given a graphql state and "local app state", the local app state should not be a result of the history of the graphql state, but only the actual state. In fact, in my apps so far, the local app state is orthogonal to the graphql state.

What you are describing sounds like Operational Transform and while that has its place, I think it should not be part of a generic graphql client. I'd say that if you want that, the graphql state is the full history of all the operations approved by the server, and the client should at all times be prepared to apply OT to the local non-confirmed operations based on incoming server operations. This is ShareJS' domain.

> code is easier to grok than me. check out https://github.com/mattkrick/redux-optimistic-ui. it's like 100 well-commented lines.

Bedtime but will do tomorrow :)

wmertens avatar Feb 12 '17 23:02 wmertens

> code is easier to grok than me. check out https://github.com/mattkrick/redux-optimistic-ui. it's like 100 well-commented lines.

Ok, so indeed, this implements sort-of-OT. The thing is, server actions aren't like local actions. They are asynchronous, can fail in various layers, generally have side effects and cannot be controlled by the redux store.

So I like to treat graphql mutations as messages-of-intent and all graphql state as "not mine". I somewhat dislike that react-apollo pressures you into defining mutations together with the component, but on the other hand, mutations are sort of declarative and shouldn't impact local state much. Still on the fence, especially since moving the mutation definition elsewhere is so much work, and many of the additional features are closely tied to the component (fetching more data, merging to cache, optimistic result).

…and under that model, I think just storing the expected server response should be sufficient, no?

wmertens avatar Feb 13 '17 08:02 wmertens

mutations should not be declarative, this is the error that cashay, relay, and apollo all made. to be declarative means mutating the denormalized state. Since 1 mutation affects many queries, you'll need each query to handle that mutation. If a query doesn't handle the mutation, then the denormalized state will be stale. Sure, it's simpler for the dev, but the surface area for errors increases from S(1) to S(n).

> storing the expected server response should be sufficient

mathematically, no. remember the old redux motto: state = fn(state, action). it's a fun enough proof. try the above example i outlined (2 upvotes, the second comes back a success, then the first comes back a failure) without persisting S0.

mattkrick avatar Feb 13 '17 18:02 mattkrick

> mutations should not be declarative, this is the error that cashay, relay, and apollo all made. to be declarative means mutating the denormalized state.

Ok, I'm not sure how that follows, and maybe we are talking about denormalization differently? And when you say state, do you mean the entire state or just the graphql cache?

For clarity - to me, the way react-apollo works is like React bound inputs. I have an application structure and in some places I ask for server data, which is automatically fetched and provided (on data instead of value). Mutations are like actions - you dispatch them to the server and eventually that results in new data props.

When you specify a mutation declaratively, including how to synchronize the local cache afterwards, you can simply call it with the right parameters and forget about it.

Of course, it is not always trivial to declare how to update the local cache, but worst case you can force refetching everything. Updating local cache is an optimization.

> Since 1 mutation affects many queries, you'll need each query to handle that mutation. If a query doesn't handle the mutation, then the denormalized state will be stale. Sure, it's simpler for the dev, but the surface area for errors increases from S(1) to S(n).

So, in react-apollo, if you get a result back with a UUID, that result will automatically update the cache and all components that are asking for that data. In other cases, you can specify how to update specific queries (by name, in the mutation def) or you can specify in queries how they update based on a specific mutation (by name, in the query def). The current API is a bit vexing but that's the idea.

> storing the expected server response should be sufficient
>
> mathematically, no. remember the old redux motto: state = fn(state, action). it's a fun enough proof. try the above example i outlined (2 upvotes, the second comes back a success, then the first comes back a failure) without persisting S0.

But this is a different world view. For Apollo, the server state/local cache is not (conceptually) part of the Redux store. It requires you to program your reducers such that for any graphql action/event A (sendQuery/sendMutation/receiveResult) fn(state, A) === state.

So, if this holds true, then you don't even notify Redux about Apollo events, and instead all application changes due to graphql in/out should happen by locally deriving from data and Redux state.

This means that you can inject any graphql state and the app will render correctly; it also means that optimistic responses can simply be replaced with actual server responses at any time :)

wmertens avatar Feb 15 '17 04:02 wmertens

> But this is a different world view. For Apollo, the server state/local cache is not (conceptually) part of the Redux store. It requires you to program your reducers such that for any graphql action/event A (sendQuery/sendMutation/receiveResult) fn(state, A) === state.

Can you give an example of this? I still don't understand

mattkrick avatar Feb 15 '17 17:02 mattkrick

So the way I understand Apollo to work, is that to get the data, you attach a query to a component, and it will get the data as props. If the data changes, the component gets new props. If you want to mutate data, you write a message to the server with the proper variables, and you can ask for data back. Apollo's job is to make the connection easy and efficient.

To perform its job, Apollo uses a local cache and local updating of cache based on mutation results. The dev needs to provide extra information to help Apollo do its job in some cases (updating lists after a delete etc)

So, at all times, the data you get from graphql can change, and you have no control over the execution of mutations. In other words, this is external state.

If you use that external state as input for your reducers (by passing the graphql data to them or by sending actions based on data content), you run the risk of caching stale external state.

Instead, if you write the reducers so that any server data is orthogonal to the application state, you can present fake external state without consequences.

This means that to implement optimistic updates in Apollo, you make a guess about server state, and you keep that guess going as long as possible. As soon as you learn that the server state is not what you think it is, you present the correct state and the application should gracefully handle that.

wmertens avatar Feb 16 '17 13:02 wmertens

How does that work with your 5 in-flight upvote example without an intermediate state?


mattkrick avatar Feb 16 '17 15:02 mattkrick

The only intermediate state is the current cache of the server data. Local app state should not be dependent on (stale) server data so there is no direct local state change on mutation returns.

If we assume that we can correctly update (or invalidate) the local cache based on the mutation return values (and their optimistic versions), we can always provide subscribed components with some optimistic or real data. This assumption implies that the cache update functions should not only be idempotent, but the return value of the mutation should be sufficient to update the cache.

So if the mutation responds with "ok, you upvoted", you don't have enough information to correctly determine the server state. If the response is "there are x votes", you can.

So for the 5 upvotes, you start at 0 votes, send an upvote and expect 1 vote back, send an upvote and expect 2 votes back, etc. You process the optimistic result as if it were the server result and wait for the server response. If the server response matches, you don't do anything; if it differs, you process the server result.

So, the requirements are:

  • Use server state only for deriving view data combined with local state
  • Post-mutation cache updaters must reflect latest server state no matter which intermediate results were there, or invalidate.
  • Mutations must send all data necessary to know latest server state

A further refinement: provide the updaters with the server response and the optimistic response. Then you don't have to ask for data you know. E.g. if you are posting comments, the server response might be the comment count and the id of the comment, but not the comment text, so the optimistic response could be {count++, lastId+1, commentText}, and on return you simply set the count to the server count and cache the comment under the correct id.
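That expect-and-compare flow could be sketched like this (illustrative names, not the Apollo API): the optimistic answer is written as if it were the server's, and the real response only triggers a write when it differs.

```javascript
const cache = {post123: {votes: 0}};

const mutateWithOptimism = (id, optimisticResponse, serverPromise) => {
  cache[id] = optimisticResponse; // treated exactly like a server response
  return serverPromise.then((serverResponse) => {
    if (serverResponse.votes !== optimisticResponse.votes) {
      cache[id] = serverResponse; // the guess was wrong: correct it
    }
  });
};

mutateWithOptimism('post123', {votes: 1}, Promise.resolve({votes: 1}))
  .then(() => console.log(cache.post123.votes)); // 1, no correction needed
```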


wmertens avatar Feb 16 '17 19:02 wmertens

Something to think about in free time:

  • When removing an item from a subscription, sometimes we want to remove that entity (say when a user gets deleted) and other times we want to leave it there (say when a user gets removed from the Top 5 list of users).

The dependency that decides whether or not to remove it is not the component/key but rather the source (query or subscription). That means we'd need to save a list of sources on each entity & track that, which is bleh.

mattkrick avatar Mar 21 '17 20:03 mattkrick

If you reason that at any one point there will only be a smallish number of cached queries (say <500), and most cacheable objects will only be involved with a small number of them, you can keep a WeakSet with each query that records whether an object is used in that query.

When an object is removed, you can check all the WeakSets and only walk the caches that have the object.
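That WeakSet bookkeeping might look like the following sketch (hypothetical shapes): membership checks are O(1), and the sets don't keep evicted objects alive.

```javascript
const post = {id: 'post123', votes: 4};
const queries = [
  {name: 'getTop5Posts', uses: new WeakSet([post]), cache: [post]},
  {name: 'getUser', uses: new WeakSet(), cache: []}
];

const removeObject = (obj) => {
  for (const q of queries) {
    if (q.uses.has(obj)) { // only walk the caches that contain the object
      q.cache = q.cache.filter((o) => o !== obj);
      q.uses.delete(obj);
    }
  }
};

removeObject(post);
console.log(queries[0].cache.length); // 0
```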

wmertens avatar Mar 22 '17 06:03 wmertens

(that is, if you are using shared singular mutable objects as cache basis like I describe at https://github.com/apollographql/apollo-client/issues/1300#issuecomment-283169452)

wmertens avatar Mar 22 '17 06:03 wmertens