cashay icon indicating copy to clipboard operation
cashay copied to clipboard

RFC: return cached data for a known type/id

Open dustinfarris opened this issue 8 years ago • 13 comments

UPDATED 1/3/17

The prefetch hook described below is not in-line with the desired style of Cashay. See discussion and tl;dr in comments.

DISCLAIMER

I am still learning about cashay implementation details and it's possible one or more of my assumptions are wrong. Please push back.

Summary

Add a hook to allow users to access the cache while a query is in-flight.

Reason

When the user requests data, it is possible that some or all of the data already exists in cashay's cache. It would be beneficial from a performance standpoint to utilize that cache.

Note: the goal is not to reduce network traffic, but to display the data we already have, if we have it.

Cashay does this already under certain circumstances:

  • when it recognizes the Operation/shape of the query
  • when the user adds the @cached decorator

There is an additional scenario, however, not covered by these—when the user is requesting a GraphQL field subset for which partial data already exists in the cache.

Consider this query from a project detail page:

query {
  project(id: $project_id) {
    id
    name
    todos {
      id
      name
    }
  }
}

The project detail gets all todos under it, selecting their individual ids and names. Now suppose the user clicks a specific todo, and is taken to a todo detail page which might have the following query:

query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}

Cashay sees this as a totally new query, yet some of the data is already in the cache from the project page! Namely, the todo's id and name.

The @cached decorator cannot be used here for two reasons:

  1. @cached assumes the entire shape can be found in cache and kicks back null for the pieces it doesn't find
  2. If the user enters the todo detail route directly, the cache does not exist yet at all. @cached provides no fallback behavior if an id is not found.

Implementation

Add a new hook for cashay.query that gives the user access to the cache. In my mind, this will behave similar to the way mutationHandlers and "optimisticVariables" work currently, with the following flow:

  1. user executes a query
  2. cashay prepares an initial result as it currently does
  3. cashay executes a "preFetch" hook, providing the initial result and access to the cache
  4. user modifies the initial result using the cache
  5. cashay continues to execute a server query in the usual way
  6. (bonus) cashay executes a "postFetch" hook and returns the final result

Pseudo (using the todo query above as todoQuery):

const todoQuery = `
query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}
`;

const stateToComputed = (state, { todo_id }) => {

  const { data: { todo }, status } = cashay.query(todoQuery, {

    variables: { todo_id },

    preFetch(initialResult, cache) {
      const cached_todo = cache.find('Todo', todo_id);
      if (cached_todo) {
        initialResult.todo = cached_todo;
        return initialResult;
      }
    },

    postFetch(finalResult, serverResponse) {
      // possible manipulation here, not strictly necessary
    }

  });

  return { todo, isLoading: status === 'loading' };
}

const TodoDetail = Component.extend({
  layout: hbs`

  <h1>{{todo.name}}</h1>
  <hr>
  {{#if isLoading}}
    <em>Loading...</em>
  {{else}}
    <div>Description: {{todo.description}}</div>
    <div>Assignee: {{todo.assignee.name}}</div>
  {{/if}}
  
  `
});

With such an API, the component will be able to display the name of the todo, while the rest of the content is loading.

Drawbacks

This opens the door for possible cache misuse and anti-pattern.

Alternatives

Modify @cached instead

Potentially, @cached could be modified to implement the same flow above, where it returns cached data, but fires a server request anyway if the cached data is not found. This would need to account for missing subfields—e.g. if the entire requested shape does not exist in cache, the server must be consulted.


additional context in #154

dustinfarris avatar Dec 30 '16 23:12 dustinfarris

Updated title and description. /cc @mattkrick @jordanh

dustinfarris avatar Dec 31 '16 00:12 dustinfarris

Intriguing proposal @dustinfarris. A suggested name for such a feature "progressive querying." I'll let met respond with his thoughts before adding mine...

jordanh avatar Jan 02 '17 18:01 jordanh

I think the goal of cashay is to have a declarative API (except for mutations, but that's for v2). In other words, you should tell it what you want to do, not how to do it. Declaratively, you want data. You shouldn't have to sorry how that's achieved. Unfortunately, to do that means we'd have to make certain assumptions about the user defined queries on the server (like how they're named), or we'd have to be more verbose.

mattkrick avatar Jan 02 '17 21:01 mattkrick

@mattkrick ok, if adding lowish-level hooks are a no-go (à la v2), then I think my alternative proposal should be considered, perhaps in a separate issue.

In a nutshell: modify existing @cached to be smarter about existing data and allow the user to specify fallback behavior in case some or all of the data is missing.

Something like:

query {
  todo @cached(id: $todo_id, type: "Todo", refresh: true) {
    id
    name
    assignee {
      id
      name
    }
  }
}

:point_up: adding "refresh: true" instructs Cashay to "give me whatever you have for the resolved id, but kick off a server request and update the information". To your point, @mattkrick, this assumes that the name of the field, todo in this case, matches the name of the server-side field to be queried.

From my understanding, current @cached allows you to name the base field whatever you want, e.g.

getTodoFromCache @cached (id: $todo_id, type: "Todo") {

so it would be necessary to either make this illegal (doubt anyone would go for that), or require the user to specify the server query field explicitly if it is not the same, e.g.

getTodoFromCacheAndServer @cached (id: $todo_id, type: "Todo", refresh: true, field: "todo") {

in the absence of field:, it is assumed that the field name is the same

todo @cached (id: $todo_id, type: "Todo", refresh: true) {

dustinfarris avatar Jan 03 '17 02:01 dustinfarris

if we know the name of the query, then do we even need to use a directive? Why not just make it a standard query & then diff out the fields that we already have? If we have all the fields, then we don't send a request to the server.

mattkrick avatar Jan 03 '17 02:01 mattkrick

Yep, that works too.

The goal, though, is to use data that potentially came from other queries—you mentioned in another issue something about not making assumptions about type/id uniqueness. I'm still fuzzy about what you meant by that.

Yes, ideally I would love for Cashay to "just do it" when I write,

query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}

:point_up:

  • look up todo on client-safe schema
  • note that it returns type "Todo", use that to search cache for Todo with todo_id
  • return any requested subfields it already has in cache
  • kick off a server request for either the whole thing (full replacement) or just the subfields that it didn't have—not sure best strategy but it's a win for me regardless

dustinfarris avatar Jan 03 '17 02:01 dustinfarris

here are the embedded assumptions in that:

  • calling todo with the same args will return the same result
  • calling todo will return a document
  • calling todo will return a document with an ID that is unique among all todos

These assumptions are all probably safe, heck relay makes these assumptions & that's why you write relay queries in such a funny way... the relay server basically has an id that is a union of all possible types. Their GUID is a base64 composite key of the id & the type, so you don't even need to repeatedly write a bunch of getXById queries, the helper on the server writes those for you. Unfortunately, this assumes you have access to change the server. That server could be set up by another person/department/company, which means you don't have the right to change it.

mattkrick avatar Jan 03 '17 02:01 mattkrick

The difference here is that Cashay knows in advance what Type will be returned. Relay and others do not. In my view, this is a clear differentiator that makes those assumptions plausible, without any goofy GUIDs. Server access not required, only the schema.

dustinfarris avatar Jan 03 '17 03:01 dustinfarris

definitely, i hear ya, it's a solid proposal. what would you propose to do about arrays of todos? For example let's say on 1 page you have a getTop50Todos. Then, on another page you wanna query 5 of those. would you have 1 query that takes in an array of ids or would you call the same todo 5 times?

mattkrick avatar Jan 03 '17 03:01 mattkrick

Interesting question. My gut reaction is the (optional) list of ids.

type Root {
  todos(ids: [ID]): [Todo]
}
query FiveTodos($todo_ids: [ID]) {
  todos(ids: $todo_ids) {
    id
    name
    description
  }
}

Aside: There are other caveats such as interfaces and unions where it may not be easy to determine to exact Type. I'd say the hard-and-fast rule should be: "we will try, but if we can't resolve the Type and id, we're going to the server". We can potentially add more tools here in the future, but I think sticking to known type+id is a good starting point.

dustinfarris avatar Jan 03 '17 03:01 dustinfarris

"other tools" might include a way to specify which Types have UUIDs when cashay is initialized.

e.g.

cashay.create({
  types_with_uuid: ["Todo", "Project"]

and then use that to decide whether uniqueness can be inferred from a particular Union query. If so, great. If not, existing operation/key in cache? Ok, still great. Otherwise, server.

But I digress...

dustinfarris avatar Jan 03 '17 03:01 dustinfarris

@cached already handles unions, although i'm not sure how well...works for me at least :smile:

mattkrick avatar Jan 03 '17 03:01 mattkrick

tl;dr the abbreviated proposal is:


If the Object Type and ID of a query can be determined in advance, Cashay will attempt to find existing data in cache, and backfill missing subfields with a server request.


dustinfarris avatar Jan 03 '17 03:01 dustinfarris