urql RFC: Increase feature-set of the populateExchange for progressive Graphcache adoption

Summary

The populateExchange can help users adopt Graphcache progressively. The current problem with adopting Graphcache is that the user has to switch away from the Document Cache at some point. However, the combined properties of the @populate directive and Graphcache mean that the populateExchange can be used to start out using Graphcache like the Document Cache and move on from there.

In such a scenario the goal would be to allow the populateExchange to trigger refetches like the Document Cache does without the user having to write updaters for Graphcache that invalidate entities.

query TodoList {
  todos(first: 10) {
    id
    title
  }
}

mutation AddTodo {
  addTodo(title: "Test") {
    todo @populate
  }
}

Given this mutation to add a Todo it may be common to define a cacheExchange.updates.Mutation.addTodo function that adds the new Todo item to several lists. In this example the mutation returns a Todo and @populate adds the appropriate fields.

If the mutation also returns a viewer field then @populate may be used on that!

mutation {
  addTodo(title: "Test") {
    viewer @populate
  }
}

This mutation may now update Query, which the viewer: Query! field links to, by adding the appropriate fields. Currently the populateExchange does not support field arguments, but if we add those then this use case is covered! The mutation would effectively be allowed to update any field related to Query.

In cases where no viewer field exists on the schema, we could allow a novel use of the @populate directive on the mutation operation itself!

mutation @populate {
  addTodo(title: "Test")
}

In this case we could allow the populateExchange to send an entirely new query operation, generated from the fields that the populateExchange knows are affected by addTodo (i.e. each path from Query to Todo).

Generally it'd be useful for @populate to only add affected fields when necessary, meaning when any types under @populate loop back to Query, only fields on Query and below should be included until the path reaches the types affected by the mutation. Once these types are reached, the usual @populate logic applies.

Once we allow this, all data may be updated automatically. We can then let the user transition to more Graphcache-based usage by letting them move the @populate directive to lower fields.

Proposed Changes

1. Allow @populate to add and track fields that have arguments, e.g. timestamp(format: UTC) or todos(first: 10).
2. Allow @populate to be added to the whole mutation operation to "populate" a separate dynamic query, which is sent to update all data related to the types affected by the mutation.
3. Filter @populate fields on the Query type (the root query type) to only include paths leading to types affected by the mutation. Once such a type is reached the usual populating logic applies, i.e. we include all known fields again.

NOTE: The good thing about these proposed changes is that they're all additive! None of them immediately require a rewrite, although the addition that's laid out in "Open Problems" may require a small data structure change, which we'd want to do for the paths from Query to other types anyway, I'd imagine.

Requirements

I'll lay out more requirements for the third proposed change. This change is crucial so that we never update all fields on Query. If we allow @populate to add every field on Query that the app has touched, then eventually it'll fetch all data the app has ever seen, which is a huge amount of data.

Instead, we can be smart about filtering, like the following:

1. Start from Query (either due to @populate on the mutation operation, which creates a new query, or due to viewer @populate, which leads to Query).
2. Take all types that the mutation affects, e.g. Todo in the example above.
3. Add fields to Query that'll eventually retrieve these types, recursively.
4. Once the type is reached on a path (e.g. Todo), apply the normal populating logic (all fields).
   - Exception (see "Open Problems"): filter fields on Todo and below by which fields are actually currently in use.
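
The filtering steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SchemaMap`, `exampleSchema`, and `pathsToAffected` are hypothetical names, and the schema is reduced to a plain map from type name to field name to return type, which is not how the populateExchange stores schema information.

```typescript
// Hypothetical, simplified schema: type name -> field name -> return type name.
// Scalar (leaf) fields map to null. This shape is an assumption for the sketch.
type SchemaMap = Record<string, Record<string, string | null>>;

const exampleSchema: SchemaMap = {
  Query: { todos: 'Todo', me: 'User', version: null },
  User: { name: null, todos: 'Todo', settings: 'Settings', viewer: 'Query' },
  Settings: { theme: null },
  Todo: { id: null, title: null },
};

// Collects every field path that starts at `typeName` and ends at the first
// affected type it reaches. `onPath` stops cycles such as `viewer: Query!`.
function pathsToAffected(
  schema: SchemaMap,
  typeName: string,
  affected: Set<string>,
  path: string[] = [],
  onPath: Set<string> = new Set<string>()
): string[][] {
  // Reached an affected type: from here on the usual populating logic applies.
  if (affected.has(typeName)) return [path];
  if (onPath.has(typeName)) return [];

  onPath.add(typeName);
  const result: string[][] = [];
  const fields = schema[typeName] || {};
  for (const fieldName of Object.keys(fields)) {
    const returnType = fields[fieldName];
    // Leaf fields can never lead to an affected type, so they're skipped.
    if (returnType !== null) {
      const next = path.concat(fieldName);
      result.push(...pathsToAffected(schema, returnType, affected, next, onPath));
    }
  }
  onPath.delete(typeName);
  return result;
}

const todoPaths = pathsToAffected(exampleSchema, 'Query', new Set(['Todo']));
```

Here `todoPaths` contains the paths `todos` and `me.todos`. The `version` and `me.settings` fields are left out because they never reach `Todo`, and the `viewer` link back to `Query` is cut off by the cycle guard.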

Open Problems

There's one problem that we'll need to address first. Given a query for a single todo:

query Todo {
  todo(id: 10) {
    title
    owners(first: 10) {
      name
    }
  }
}

We wouldn't want all Todos to now have owners fields. Given Viewer -> populate fields leading to... -> Todo -> populate all fields we'd do that; all todos would now be queried with the owners field. So is there a heuristic that also stops these fields from always being populated eagerly?

I'd propose that, in this case, any field below Todo would only receive fields that are currently in use in the app. This way, especially if schema awareness is used, we'd have all the data necessary to render or re-render, but Graphcache can re-send certain queries to get the data again as needed!
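
A rough sketch of this heuristic, assuming each field has a counter of the operations that currently select it. `UsageTracker` and its methods are hypothetical names for illustration and not part of the populateExchange.

```typescript
// Hypothetical usage tracker: counts how many active operations select a field.
class UsageTracker {
  private counts: Record<string, number> = {};

  private key(typeName: string, fieldName: string): string {
    return typeName + '.' + fieldName;
  }

  // Called when an operation that selects the field becomes active.
  track(typeName: string, fieldName: string): void {
    const k = this.key(typeName, fieldName);
    this.counts[k] = (this.counts[k] || 0) + 1;
  }

  // Called when such an operation is torn down.
  untrack(typeName: string, fieldName: string): void {
    const k = this.key(typeName, fieldName);
    this.counts[k] = Math.max(0, (this.counts[k] || 0) - 1);
  }

  // Keeps only the candidate fields that some active operation still selects.
  inUse(typeName: string, candidates: string[]): string[] {
    return candidates.filter((f) => (this.counts[this.key(typeName, f)] || 0) > 0);
  }
}

const usage = new UsageTracker();
// A list query selecting id and title becomes active.
usage.track('Todo', 'id');
usage.track('Todo', 'title');
// A detail query selecting title and owners becomes active...
usage.track('Todo', 'title');
usage.track('Todo', 'owners');
// ...and is torn down again.
usage.untrack('Todo', 'title');
usage.untrack('Todo', 'owners');

const populated = usage.inUse('Todo', ['id', 'title', 'owners']);
```

After the detail query is torn down, `populated` holds only `id` and `title`. While the detail query is still active, `owners` would count as in use in this sketch, so it narrows the problem rather than fully solving it.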

Sep 07 '20 16:09 kitten

Implementation Plan

Only one data structure is needed to track all fields:

interface Field {
  activeOperations: number,
  parentFields: Set<Field>,
  returnTypeName?: string,
  fieldName: string,
  arguments: any, // NOTE: In the POC we'd just re-add them from scratch, YOLO
}

// typename => fieldKey (fieldName + arguments) => Field[]
type Fields = Record<string, Record<string, Field[]>>;
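
As a minimal sketch of how such a structure could be filled, assuming a field key is built from the field name plus its serialized arguments: `fieldKey` and `trackField` are hypothetical helpers, and the key format shown is an assumption rather than the format the exchange uses.

```typescript
// Mirrors the Field shape above; `arguments` stays loosely typed as in the plan.
interface Field {
  activeOperations: number;
  parentFields: Set<Field>;
  returnTypeName?: string;
  fieldName: string;
  arguments: any;
}

// typename => fieldKey (fieldName + arguments) => Field[]
type Fields = Record<string, Record<string, Field[]>>;

// Hypothetical key format. Top-level argument names are sorted so that the
// order in which arguments were written doesn't produce different keys.
function fieldKey(fieldName: string, args?: Record<string, unknown> | null): string {
  const names = args ? Object.keys(args).sort() : [];
  if (!args || names.length === 0) return fieldName;
  const sorted: Record<string, unknown> = {};
  for (const name of names) sorted[name] = args[name];
  return fieldName + '(' + JSON.stringify(sorted) + ')';
}

// Registers a field under its parent type name and field key.
function trackField(fields: Fields, typeName: string, field: Field): string {
  const key = fieldKey(field.fieldName, field.arguments);
  const byKey = fields[typeName] || (fields[typeName] = {});
  (byKey[key] || (byKey[key] = [])).push(field);
  return key;
}

const fields: Fields = {};
const todosKey = trackField(fields, 'Query', {
  activeOperations: 1,
  parentFields: new Set<Field>(),
  returnTypeName: 'Todo',
  fieldName: 'todos',
  arguments: { first: 10 },
});
```

With this key format, `todos(first: 10)` and `todos(first: 20)` end up under separate keys on `Query`, while a field without arguments is keyed by its name alone.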

Fragments won't be tracked anymore. It's assumed that only missing fields from a given selection set are added and that it doesn't matter how they're added, since the API does not care.

Case: Mutation that alters Todo

Todo upwards, using parentFields, until Root Type / Query is reached
Todo downwards, using Type traversal, until Leaf nodes are reached
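
The upward half of this traversal could look roughly like the sketch below. It assumes that a field with no entries in `parentFields` sits directly on Query; `makeField` and `pathsFromRoot` are hypothetical helpers, and the downward type traversal is not covered here.

```typescript
// Same Field shape as in the implementation plan.
interface Field {
  activeOperations: number;
  parentFields: Set<Field>;
  returnTypeName?: string;
  fieldName: string;
  arguments: any;
}

// Small helper for the example only; not part of the proposal.
function makeField(fieldName: string, returnTypeName: string, parents: Field[] = []): Field {
  return {
    activeOperations: 1,
    parentFields: new Set<Field>(parents),
    returnTypeName,
    fieldName,
    arguments: null,
  };
}

// Walks upwards via parentFields and returns every chain of fields leading
// from a root field (assumed: one without parents) down to `field`.
function pathsFromRoot(field: Field, onPath: Set<Field> = new Set<Field>()): Field[][] {
  if (onPath.has(field)) return []; // guard against cyclic parent links
  if (field.parentFields.size === 0) return [[field]];
  onPath.add(field);
  const paths: Field[][] = [];
  field.parentFields.forEach((parent) => {
    pathsFromRoot(parent, onPath).forEach((p) => paths.push(p.concat(field)));
  });
  onPath.delete(field);
  return paths;
}

const queryTodos = makeField('todos', 'Todo');
const queryMe = makeField('me', 'User');
const userTodos = makeField('todos', 'Todo', [queryMe]);

// Fields whose return type is the affected type are the starting points.
const upward = [queryTodos, queryMe, userTodos]
  .filter((f) => f.returnTypeName === 'Todo')
  .map((f) => pathsFromRoot(f).map((p) => p.map((x) => x.fieldName).join('.')));
```

For a mutation that alters Todo, this yields the path `todos` for the root field and `me.todos` for the field nested under `me`.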

Case: Reading Fields with an interface type

Unwrap types to get interfaces
Read all fields that are used from interfaces
Read all fields from the concrete type that aren't in the interface
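
One possible reading of these steps, sketched under assumptions: `usedFields` is a hypothetical map of the fields currently in use per type name, and `unwrapTypeName` is a string-based stand-in for unwrapping list and non-null wrappers on a real schema type.

```typescript
// Hypothetical input for the sketch; a real implementation would derive this
// from the tracked fields rather than from a hard-coded map.
const usedFields: Record<string, string[]> = {
  Node: ['id'],
  Todo: ['title', 'id'],
};

// Stand-in for unwrapping list and non-null wrappers, e.g. "[Node!]!" -> "Node".
function unwrapTypeName(type: string): string {
  return type.replace(/[\[\]!]/g, '');
}

// Takes the used fields of the interface first, then adds the used fields of
// the concrete type that the interface doesn't already cover.
function fieldsForInterface(returnType: string, concreteType: string): string[] {
  const interfaceName = unwrapTypeName(returnType);
  const result = (usedFields[interfaceName] || []).slice();
  (usedFields[concreteType] || []).forEach((name) => {
    if (result.indexOf(name) === -1) result.push(name);
  });
  return result;
}

const nodeFields = fieldsForInterface('[Node!]!', 'Todo');
```

In this example `nodeFields` is `id` followed by `title`: `id` comes from the interface, and `title` is the only field on the concrete type that the interface doesn't cover.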

Sep 08 '20 14:09 kitten

I'm late to this!

> We wouldn't want all Todos to now have owners fields. Given Viewer -> populate fields leading to... -> Todo -> populate all fields we'd do that; all todos would now be queried with the owners field. So is there a heuristic that also stops these fields from always being populated eagerly?

With this point, I think it would be best to always include the full traversed fragment.

That in itself is the magic of the populate exchange - by introducing more conditional logic about what we do/don't include, we lose that simplicity and in those cases it's arguably easier then to not use the exchange and just specify the required fields manually.

If we do, however, want to do something like that, making it opt-in might be a way to go.

# Excuse the bad syntax
mutation SomeMutation(id) @populate(shallow: true)

# @populate(shallow: true) only includes fields which return primitive types

Jan 04 '21 10:01 andyrichardson

After https://github.com/urql-graphql/urql/pull/2897 we have started working towards this; however, there are a few issues we'll have to solve:

[ ] add aliases for duplicate fields with different arguments
[x] find solution to the fields-graph growing endlessly (#3023)

And we are missing support for:

- the viewer field
- re-fetching through populating relevant queries

Jan 12 '23 13:01 JoviDeCroock