
__resolveReference within the same service

Open Otard95 opened this issue 2 years ago • 3 comments

The feature

The idea is simple and the title pretty much explains it.
We (my colleagues and I, and hopefully the community) would love to see __resolveReference be able to rehydrate its parent type, even when the reference comes from the same service.

Our packages

Our situation

This would of course be useful in many parts of our GraphQL layer, but I'll use our 'geo' (geolocation) subgraph as an example. I'll jot down some of our types so I can refer to them in my examples:

"""
Definition: [Wikipedia](https://en.wikipedia.org/wiki/Administrative_division)
"""
type AdministrativeDivision implements Node @key(fields: "id") {
  id: ID!
  country: Country!
  # ... some info on this node
  parentDivision: AdministrativeDivision
  childDivisions: [AdministrativeDivision!]
  postCodes: [PostCode!]
}

type Country implements Node @key(fields: "id") {
  id: ID!
  # ... some info on this node
  """
  All the highest level administrative divisions in the country
  """
  administrativeDivisions: [AdministrativeDivision!]!
}

type PostCode implements Node @key(fields: "id") {
  id: ID!
  # ... some info on this node
  country: Country!
  """
  The parent administrative divisions in which this post code resides.
  """
  administrativeDivisions: [AdministrativeDivision!]!
}

Response Cache

Because this kind of data rarely changes, caching is very important for us. Using apollo-server-plugin-response-cache would not be optimal, because we would end up caching a lot of duplicate data. For example, a query that fetches PostCodes by id or by the postal code itself, and also asks for their AdministrativeDivisions, would cache many duplicate AdministrativeDivisions, since many different post codes lie in the same administrative division. With apollo-server-plugin-response-cache, only identical requests can reuse the cached data, administrative divisions included.
Example:

query GetPostCodeById {
  getPostCode(id: 123) {
    id
    administrativeDivisions {
      id
      # other fields
    }
  }
}

and

query GetPostCodeById {
  getPostCode(id: 456) {
    id
    administrativeDivisions {
      id
      # other fields
    }
  }
}

would be two different cache entries, and if both post codes share the same administrative divisions, we now have duplicates. The cached AdministrativeDivisions cannot be reused elsewhere either, because they are coupled to a specific post code.
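To make the duplication concrete, here is a minimal TypeScript sketch of a cache keyed by the whole request. It is not how apollo-server-plugin-response-cache is implemented; `responseCache`, `responseKey` and the sample division `ad-1` are hypothetical names used only for illustration.

```typescript
// Hypothetical data shapes; field names follow the schema above.
interface AdministrativeDivision { id: string; name: string; }
interface PostCodeResponse { id: string; administrativeDivisions: AdministrativeDivision[]; }

// A whole-response cache: one entry per distinct (operation, variables) pair.
const responseCache = new Map<string, PostCodeResponse>();

function responseKey(operation: string, variables: Record<string, unknown>): string {
  return `${operation}:${JSON.stringify(variables)}`;
}

// Assumed sample data: one division shared by both post codes.
const sharedDivision: AdministrativeDivision = { id: "ad-1", name: "Example division" };

// Two post codes in the same division, fetched by two different requests.
// Each cached response carries its own copy of the division.
for (const id of ["123", "456"]) {
  responseCache.set(responseKey("GetPostCodeById", { id }), {
    id,
    administrativeDivisions: [{ ...sharedDivision }],
  });
}

// Count how many copies of the shared division the cache now holds.
let copiesOfSharedDivision = 0;
responseCache.forEach((response) => {
  for (const division of response.administrativeDivisions) {
    if (division.id === sharedDivision.id) copiesOfSharedDivision += 1;
  }
});

console.log(responseCache.size, copiesOfSharedDivision); // prints "2 2"
```

Every additional post code in the same division adds one more copy, which is the growth we would like to avoid.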

Caching on data fetching layer

Caching on the data fetching layer, in our case RESTDataSource from apollo-datasource-rest, would mitigate some of this duplication, but not entirely eliminate it. It also has its own issues.

As for duplication: we have multiple APIs that provide data for the same type. For example, AdministrativeDivision.childDivisions uses api/administrative/divisions/children, while lookups of divisions by id, such as AdministrativeDivision.parentDivision, use api/administrative/divisions.

As for its own issues: the main one is that the overridable method cacheKeyFor has no obvious or good way of using the cacheHint, or of providing similar options.

How this feature would solve such issues

The solution we wanted to go with is a wrapper for our resolvers that determines the cache key from the returned nodes' __typename and id or, where that is not possible, from the parent's __typename and id, the resolved field, and any arguments.
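The two kinds of key could be derived roughly as follows. This is only a sketch: `entityKey`, `fieldKey` and `NodeRef` are hypothetical helpers, not part of any Apollo package, and the way arguments are serialized is an assumption.

```typescript
// Hypothetical shape of anything identifiable by __typename and id.
interface NodeRef { __typename: string; id: string; }

// Key for a node that can be identified by its own __typename and id.
function entityKey(ref: NodeRef): string {
  return `${ref.__typename}.${ref.id}`;
}

// Fallback key for a field result: parent identity, field name, and arguments.
// Argument names are sorted so that { a, b } and { b, a } produce the same key.
function fieldKey(parent: NodeRef, field: string, args: Record<string, unknown> = {}): string {
  const serializedArgs = Object.keys(args)
    .sort()
    .map((name) => `${name}=${JSON.stringify(args[name])}`)
    .join("&");
  return `${entityKey(parent)}.${field}.${serializedArgs}`;
}
```

With these helpers, `entityKey({ __typename: "AdministrativeDivision", id: "42" })` gives `AdministrativeDivision.42`, and `fieldKey` for the same parent with field `childDivisions` and arguments `{ first: 10 }` gives `AdministrativeDivision.42.childDivisions.first=10`.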

This would allow, for example, the following query

getAdministrativeDivisions(id: ID!): AdministrativeDivision!

to cache the normalized result with only a reference to its parent instead of the entire parent object. This way, the cache entry under the key <__typename>.<id> would hold only the data belonging to that object plus any references it has. With the same wrapper on __resolveReference, the cached reference could then be resolved, and the referenced object cached in turn with its own references.
This also matches the API for getting administrative divisions, which returns only an id for the parent.
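A rough sketch of that normalization, assuming an in-memory `Map` as the cache and a hypothetical `fetchDivision` function standing in for the API call. None of these names come from Apollo; `resolveDivisionReference` only plays the role `__resolveReference` would play if it were invoked for same-service references, and everything is kept synchronous for brevity where a real data source would return promises.

```typescript
interface DivisionRef { __typename: "AdministrativeDivision"; id: string; }

// Normalized shape: the parent is stored as a reference, not as a nested object.
interface DivisionRecord {
  __typename: "AdministrativeDivision";
  id: string;
  name: string;
  parentDivision: DivisionRef | null;
}

const entityCache = new Map<string, DivisionRecord>();
let apiCalls = 0;

// Hypothetical stand-in for the divisions API; it returns only the parent's id.
function fetchDivision(id: string): { id: string; name: string; parentId: string | null } {
  apiCalls += 1;
  return { id, name: `Division ${id}`, parentId: id === "root" ? null : "root" };
}

// Look up by <__typename>.<id>; on a miss, fetch, normalize, and cache.
function resolveDivisionReference(ref: DivisionRef): DivisionRecord {
  const key = `${ref.__typename}.${ref.id}`;
  const cached = entityCache.get(key);
  if (cached) return cached;

  const raw = fetchDivision(ref.id);
  const record: DivisionRecord = {
    __typename: "AdministrativeDivision",
    id: raw.id,
    name: raw.name,
    // Keep only a reference to the parent; it is resolved (and cached) separately.
    parentDivision:
      raw.parentId === null ? null : { __typename: "AdministrativeDivision", id: raw.parentId },
  };
  entityCache.set(key, record);
  return record;
}
```

Resolving the same reference twice hits the API once, and following `parentDivision` through the same function fills the cache entry for the parent without duplicating it inside the child.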

In the case of AdministrativeDivision.childDivisions, the API for getting administrative divisions returns no reference ids for its children. As previously mentioned, we need to call the api/administrative/divisions/children API with the parent's id, so a field resolver is necessary here. We would then need to cache the result under the key <parent __typename>.<parent id>.childDivisions.<field args>. But again, this should cache only references, which can in turn be resolved by __resolveReference, where the objects may already have been cached by a previous query.
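The field-level variant might look like the sketch below. `fetchChildIds` is a hypothetical stand-in for the children API (assumed here to yield child ids), and `fieldCache` and `resolveChildDivisions` are illustrative names, not existing Apollo APIs. The point is that the field cache holds only references.

```typescript
interface Ref { __typename: string; id: string; }

// Field-level cache: holds only lists of references, never full objects.
const fieldCache = new Map<string, Ref[]>();
let childrenApiCalls = 0;

// Hypothetical stand-in for api/administrative/divisions/children.
function fetchChildIds(parentId: string): string[] {
  childrenApiCalls += 1;
  return [`${parentId}-1`, `${parentId}-2`];
}

// Field resolver for AdministrativeDivision.childDivisions that returns references only.
function resolveChildDivisions(parent: Ref, args: Record<string, unknown> = {}): Ref[] {
  // Key shape: <parent __typename>.<parent id>.childDivisions.<field args>
  const key = `${parent.__typename}.${parent.id}.childDivisions.${JSON.stringify(args)}`;
  const cached = fieldCache.get(key);
  if (cached) return cached;

  const refs = fetchChildIds(parent.id).map(
    (id): Ref => ({ __typename: "AdministrativeDivision", id }),
  );
  fieldCache.set(key, refs);
  return refs;
}
```

With the requested feature, each returned reference would then be passed to __resolveReference, so the full child objects live only under their own <__typename>.<id> entries.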

Note: In the case of resolving the parentDivision of an AdministrativeDivision, the field cache would be fine: we could wrap the parent field resolver on the administrative division, because it returns only one node and not an array. But even then, where to find the id of the object being resolved differs from case to case and requires a slight modification to the implementation of each resolver.

This was very caching-heavy, but there are other considerations. The feature would also be extremely useful in other parts of our GraphQL layer where caching matters less. I can't go through our entire GraphQL layer in this much detail, but hopefully you can see how this extends to other use cases.

Other considerations

This was discussed in #1055, just not in the form of a feature request. @queerviolet mentions:

Our current recommendation here is to take your __resolveReference(forType:) logic and put it in a function that's accessible to all resolvers.

In a way we already do this in the form of RESTDataSource, but there is still the issue of duplicate data caching and always needing a slightly modified implementation for the resolver.
This type of verbosity was also addressed in #398.

We should also consider separation of concerns. An administrative division (admin div) might be related to a country, but it's not really the job of the admin div's resolvers to resolve that type.

Types may be resolvable with references at the query stage, but if a field resolver is needed, that resolver should still be able to return just references, because the source from which the field is resolved may not provide the entire return type. For example, the data may come from a relation table in an SQL database where each row contains only the ids of the related objects.

Otard95 avatar Jul 05 '22 12:07 Otard95

Yes please!

dbsmck avatar Aug 03 '23 12:08 dbsmck