federation
__resolveReference within the same service
The feature
The idea is simple, and the title pretty much explains it. We (my colleagues and I, and hopefully the community) would love for __resolveReference to be able to rehydrate its parent type even when the reference comes from the same service.
Our packages
Our situation
This would of course be useful in many parts of our GraphQL layer, but I'll use our 'geo' (geo-location) subgraph as an example. I'll jot down some of our types so I can refer to them in my examples:
"""
Definition: [Wikipedia](https://en.wikipedia.org/wiki/Administrative_division)
"""
type AdministrativeDivision implements Node @key(fields: "id") {
  id: ID!
  country: Country!
  # ... some info on this node
  parentDivision: AdministrativeDivision
  childDivisions: [AdministrativeDivision!]
  postCodes: [PostCode!]
}
type Country implements Node @key(fields: "id") {
  id: ID!
  # ... some info on this node
  """
  All the highest level administrative divisions in the country
  """
  administrativeDivisions: [AdministrativeDivision!]!
}
type PostCode implements Node @key(fields: "id") {
  id: ID!
  # ... some info on this node
  country: Country!
  """
  The parent administrative divisions in which this post code resides.
  """
  administrativeDivisions: [AdministrativeDivision!]!
}
Response Cache
Because this kind of data rarely changes, caching is very important for us. Using apollo-server-plugin-response-cache would not be optimal, because we would end up with a lot of duplicate data in the cache. For example, a query that fetches a PostCode by either its id or the postal code itself, and that also asks for its AdministrativeDivisions, would cache many duplicates of those AdministrativeDivisions, since many different post codes lie in the same administrative division. With apollo-server-plugin-response-cache, only identical requests can use the cached data, which includes the administrative divisions.
Example:
query GetPostCodeById {
  getPostCode(id: 123) {
    id
    administrativeDivisions {
      id
      # other fields
    }
  }
}
and
query GetPostCodeById {
  getPostCode(id: 456) {
    id
    administrativeDivisions {
      id
      # other fields
    }
  }
}
would be two different cache entries, and if both post codes share the same administrative divisions, we now have duplicates. We cannot reuse the cached AdministrativeDivisions either, because they are coupled to a specific post code.
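To make the duplication concrete, here is a minimal sketch of a whole-response cache. It is a simplified model and not how apollo-server-plugin-response-cache is actually implemented; the names responseCache and cacheResponse and the sample division are made up for illustration.

```typescript
// Simplified, hypothetical model of a whole-response cache (not the plugin's
// real implementation): one entry per distinct (query, variables) pair.
type Division = { id: string; label: string };
type PostCodeResult = { id: string; administrativeDivisions: Division[] };

const responseCache = new Map<string, PostCodeResult>();

function cacheResponse(
  query: string,
  variables: { id: string },
  data: PostCodeResult,
): void {
  responseCache.set(JSON.stringify({ query, variables }), data);
}

// Made-up sample data: two post codes that lie in the same division.
const sharedDivision: Division = { id: "div-1", label: "Some division" };
const postCodeQuery =
  "query GetPostCodeById($id: ID!) { getPostCode(id: $id) { id administrativeDivisions { id } } }";

cacheResponse(postCodeQuery, { id: "123" }, {
  id: "123",
  administrativeDivisions: [{ ...sharedDivision }],
});
cacheResponse(postCodeQuery, { id: "456" }, {
  id: "456",
  administrativeDivisions: [{ ...sharedDivision }],
});

// Count how many copies of the shared division the cache now holds.
let cachedCopies = 0;
responseCache.forEach((result) => {
  cachedCopies += result.administrativeDivisions.filter(
    (d) => d.id === sharedDivision.id,
  ).length;
});
```

Each post code gets its own entry, and every entry carries its own copy of the division, so the number of copies grows with the number of distinct post codes queried.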
Caching on data fetching layer
Caching on the data fetching layer, in our case RESTDataSource from apollo-datasource-rest, would mitigate some of this duplication but not eliminate it entirely. It also has its own issues.
As for the duplication: we have multiple APIs that provide data for the same type. For example, AdministrativeDivision.childDivisions uses api/administrative/divisions/children, while lookups of divisions by id, such as AdministrativeDivision.parentDivision, use api/administrative/divisions.
As for its own issues: primarily, the overridable method cacheKeyFor has no obvious or good way of using the cacheHint or of providing similar options.
How this feature would solve such issues
The solution we wanted to go with is a wrapper for our resolvers that determines the cache key from the returned nodes' __typename and id or, where that is not possible, from the parent's __typename and id, the resolved field, and any arguments.
This would allow for example the following query
getAdministrativeDivisions(id: ID!): AdministrativeDivision!
to cache the normalized result with only a reference to its parent instead of the entire parent object. This way, the cache entry under the key <__typename>.<id> would hold only the data belonging to that object, plus any references it may have. Using the same wrapper on __resolveReference, it would then be possible to resolve that cached reference and cache the referenced object, again with references. This also matches up perfectly with the API for getting administrative divisions, as it only returns an id for the parent.
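A rough sketch of what such normalization could look like, assuming every node carries __typename and id. Everything here (Ref, Entity, nodeCache, normalize, resolveReference) is a hypothetical name for illustration, not an Apollo API, and it is synchronous only to keep the sketch short.

```typescript
// Hypothetical shapes and helpers; none of these names come from Apollo.
type Ref = { __typename: string; id: string };
type Entity = Ref & { [field: string]: unknown };

const nodeCache = new Map<string, Entity>();

// Cache key in the form <__typename>.<id>.
const cacheKey = (ref: Ref): string => `${ref.__typename}.${ref.id}`;

const isEntity = (value: unknown): value is Entity =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as Ref).__typename === "string" &&
  typeof (value as Ref).id === "string";

// Store `node` under its own key, replacing nested nodes with bare references.
// Nested nodes are normalized recursively and cached under their own keys.
function normalize(node: Entity): Ref {
  const entry: Entity = { __typename: node.__typename, id: node.id };
  for (const field of Object.keys(node)) {
    const value = node[field];
    if (isEntity(value)) {
      entry[field] = normalize(value);
    } else if (Array.isArray(value)) {
      entry[field] = value.map((v) => (isEntity(v) ? normalize(v) : v));
    } else {
      entry[field] = value;
    }
  }
  nodeCache.set(cacheKey(entry), entry);
  return { __typename: node.__typename, id: node.id };
}

// Resolve a reference from the cache, falling back to a caller-supplied loader.
function resolveReference(ref: Ref, load: (ref: Ref) => Entity): Entity {
  const cached = nodeCache.get(cacheKey(ref));
  if (cached !== undefined) return cached;
  const loaded = load(ref);
  normalize(loaded);
  return nodeCache.get(cacheKey(loaded)) as Entity;
}
```

With this shape, a division that arrives nested inside another result is cached once under its own key, and any later reference to it is served from that single entry.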
In the case of AdministrativeDivision.childDivisions, the API for getting administrative divisions returns no reference ids for its children. As previously mentioned, we need to call the api/administrative/divisions/children API with the parent's id, so a field resolver is necessary here. At that point we would need to cache the result under the key <parent __typename>.<parent id>.childDivisions.<field args>, but again, this should only cache references, which can in turn be resolved by __resolveReference, where the node might already have been cached by a previous query.
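A minimal sketch of the field-level variant, assuming the field arguments can be serialized with JSON.stringify (which is sensitive to key order, so a real implementation would probably want something more stable). fieldCache, fieldCacheKey and cacheAsRefs are made-up names, and the resolver signature is simplified to (parent, args).

```typescript
// Hypothetical helpers for illustration; not part of any Apollo package.
type Ref = { __typename: string; id: string };
type FieldResolver = (parent: Ref, args: Record<string, unknown>) => Ref[];

const fieldCache = new Map<string, Ref[]>();

// Key in the form <parent __typename>.<parent id>.<field>.<field args>.
function fieldCacheKey(
  parent: Ref,
  field: string,
  args: Record<string, unknown>,
): string {
  return `${parent.__typename}.${parent.id}.${field}.${JSON.stringify(args)}`;
}

// Wrap a field resolver so that only references to its result are cached.
function cacheAsRefs(field: string, resolve: FieldResolver): FieldResolver {
  return (parent, args) => {
    const key = fieldCacheKey(parent, field, args);
    const hit = fieldCache.get(key);
    if (hit !== undefined) return hit;
    // Strip everything except __typename and id before caching.
    const refs = resolve(parent, args).map(({ __typename, id }) => ({
      __typename,
      id,
    }));
    fieldCache.set(key, refs);
    return refs;
  };
}
```

The wrapped resolver hands back bare references, which is exactly the point where we would want __resolveReference to take over and hydrate each node, even though it lives in the same service.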
Note: When resolving the parentDivision of an AdministrativeDivision, the field cache would be fine, because we could wrap the parentDivision field resolver, as it returns only one node and not an array. But even then, where to find the id of the object being resolved differs from case to case and requires a slightly modified implementation of each resolver.
This was very caching-heavy, but the feature would also be extremely useful in other parts of our GraphQL layer where caching matters less. I can't go through our entire GraphQL layer in this much detail, but hopefully you can see how this extends to other use cases.
Other considerations
This was discussed in #1055, just not in the form of a feature request. @queerviolet mentions:
Our current recommendation here is to take your __resolveReference(forType:) logic and put it in a function that's accessible to all resolvers.
In a way we already do this in the form of RESTDataSource, but there is still the issue of duplicate data caching and of always needing a slightly modified implementation for each resolver.
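For reference, this is roughly how I read that recommendation, as a sketch with made-up data. loadDivision, DivisionRecord and the in-memory map are stand-ins for our data source rather than real code from our service, and the resolver signatures are trimmed to the arguments that matter here.

```typescript
type DivisionRecord = { id: string; parentId: string | null };

// Made-up in-memory stand-in for the api/administrative/divisions endpoint.
const divisionsById = new Map<string, DivisionRecord>([
  ["1", { id: "1", parentId: null }],
  ["2", { id: "2", parentId: "1" }],
]);

// The shared function: the one place that knows how to load a division by id.
function loadDivision(id: string): DivisionRecord | null {
  return divisionsById.get(id) ?? null;
}

const resolvers = {
  AdministrativeDivision: {
    // Entry point for references coming from other services.
    __resolveReference: (ref: { id: string }) => loadDivision(ref.id),
    // Same-service field: the id has to be dug out of the parent object.
    parentDivision: (division: DivisionRecord) =>
      division.parentId === null ? null : loadDivision(division.parentId),
  },
};
```

The loading logic is shared, but every field resolver still has to know where its id lives and call the function itself, which is the per-resolver variation mentioned above.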
This type of verbosity was also addressed in #398.
We should also consider separation of concerns. An administrative division (admin div) might be related to a country, but it's not really the job of the admin div's resolvers to resolve that type.
Types may be able to resolve with references at the query stage, but if a field resolver is needed, that resolver should still be able to return just references, because the source from which the field is resolved may not provide the entire return type. For example, the data might come from a relation table in an SQL database where each row contains only the ids of the related objects.
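As a sketch of the relation-table case, assuming made-up rows that hold only ids: the field resolver below returns bare references and nothing else. This shows the behaviour we are asking for, not what federation does today, and DivisionCountryRow and the sample rows are invented for the example.

```typescript
type Ref = { __typename: string; id: string };
type DivisionCountryRow = { divisionId: string; countryId: string };

// Made-up rows standing in for a SQL relation table.
const divisionCountryRows: DivisionCountryRow[] = [
  { divisionId: "1", countryId: "NO" },
  { divisionId: "2", countryId: "NO" },
];

const resolvers = {
  Country: {
    // Returns references only. Hydrating each division would be left to
    // AdministrativeDivision.__resolveReference under the proposed feature.
    administrativeDivisions: (country: { id: string }): Ref[] =>
      divisionCountryRows
        .filter((row) => row.countryId === country.id)
        .map((row) => ({
          __typename: "AdministrativeDivision",
          id: row.divisionId,
        })),
  },
};
```

Country's resolver never needs to know how an administrative division is loaded; it only states which ones are related.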
Yes please!