Supporting Batch Resolvers in GraphQL-Ruby
There are some promising experiments out there (https://www.youtube.com/watch?v=bL2JCd1lo80, https://github.com/gmac/graphql-breadth-exec) about using "batch resolvers" to simplify data fetching for GraphQL and make GraphQL queries run faster.
In short, it involves changing the call signature of resolvers:
```diff
- resolve(object, arguments, context)
+ resolve(objects, arguments, context)
```
So that for a list selection like `users { profilePic(size: LARGE) }`, `User.profilePic` is resolved for the whole list of `User`s at once. Then, the GraphQL engine handles each object and continues executing.
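For a concrete sense of the shape, here's a minimal sketch of a batched `profilePic` resolver; `ProfilePicStore.urls_for` is a hypothetical helper, not a real API:

```ruby
# Hypothetical batched resolver using the proposed signature:
def resolve(users, arguments, context)
  # One lookup for the whole list of users...
  urls = ProfilePicStore.urls_for(users.map(&:id), size: arguments[:size])
  # ...then one result per user, in the same order:
  users.map { |user| urls[user.id] }
end
```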
This addresses the same problem as `GraphQL::Dataloader` and `GraphQL::Batch`, but in a totally different way. Both of the existing solutions run a query without any real knowledge of batching; batching is "slapped on" after the fact. But batch resolvers would recognize the need to handle similar objects in similar ways at the core of GraphQL execution.
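For contrast, here's roughly what that looks like with `GraphQL::Dataloader` today: the field resolver still runs once per object, and batching happens behind a source (the `ProfilePic` model here is illustrative):

```ruby
class ProfilePicSource < GraphQL::Dataloader::Source
  def fetch(user_ids)
    # One query for all of the pending keys...
    pics = ProfilePic.where(user_id: user_ids).index_by(&:user_id)
    # ...returned in the same order the keys were requested:
    user_ids.map { |id| pics[id] }
  end
end

# The field resolver still has the per-object signature:
def profile_pic
  dataloader.with(ProfilePicSource).load(object.id)
end
```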
Soon, I intend to add support for this flow in GraphQL-Ruby and I'm opening this issue in case anyone wants to discuss the idea.
Before I add this, I want to explore a few much-needed cleanups to the runtime code (e.g. #5389, #5422). I think paying down some tech debt will make it easier to add this feature ("make the change easy...").
👏 Exciting trajectory!
FWIW – we actually intend to take Cardinal execution one step further than just single batched fields, which still fall short in cases such as a single field that commonly loads multiple times via aliases, or different fields that commonly load and resolve similar dependencies. For these cases we intend to establish field aggregation, where we'd automatically defer and group aggregation fields, then call aggregation resolvers with a set of fields (and each of those fields has its own set of objects).
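Purely to illustrate that shape (not a committed API), an aggregation resolver might look something like:

```ruby
# Illustrative only: each aggregated field arrives with its own batch of
# objects and its own arguments (aliased occurrences may differ).
def resolve_aggregation(fields, context)
  fields.map do |field|
    # `field.objects` / `field.arguments` and `load_results_for`
    # are hypothetical names for this sketch:
    load_results_for(field.objects, field.arguments)
  end
end
```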
Basically – promises require backtracking (the runtime has to circle back into pending fields once their data arrives), and we don't like that. Think of this paradigm as eager deferment instead.
Also FWIW, our Cardinal shim that puppets the GraphQL-Ruby runtime now passes almost our full test suite, and I just got some initial numbers from a production test today... looks like the shim's p90 roughly matches the native gem's p50 in this test query, which loads large volumes of products and their variants. The shim is winning by ~700ms on a 5.5s query.
So – the takeaway there is that just tossing depth recursion and running `evaluate_selection` down to `evaluate_selection_with_keyword_args` on a tight loop across a breadth set is mostly backwards compatible and still seems to be a speed advantage. Of course, switching to a batch resolver is considerably faster. I suspect a major win is that we only have one skeleton of the depth tree, and thus only ever run `gather_selections` and friends once per document field.
Hey @gmac, I'm not sure I totally understand... On these two points:
> tossing depth recursion and running `evaluate_`... on a tight loop
and
> switching to a batch resolver
I was thinking those two were the same -- do you agree? Or are you drawing a contrast between them here?
They are very different, but produce a similar result. One is backwards-compatible, the other is net-new. So basically, our breadth engine says we're always running sets of concretely-typed objects:
```ruby
objects = [<product1>, <product2>, <product3>]
```
The fastest way we've found to execute those is to use `resolve(objects, ...)`, passing in the full set and letting the mapping implementation be as fast as our application logic (we want our application to be the bottleneck, not our engine). That looks something like this:
```ruby
def resolve(objects, args, context)
  # One query for the whole batch, indexed by the owning object's id:
  results_by_obj_id = Thing.where(obj_id: objects.map(&:id)).index_by(&:obj_id)
  # One result per incoming object, in the same order:
  objects.map { results_by_obj_id[_1.id] }
end
```
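Note the design choice here: the mapping step is plain application code, with no promises and no deferred re-entry into the runtime.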
However, this batch resolver is a net-new pattern; it's not backwards compatible. So what I was driving at is that just doing this horizontal resolution pattern with the existing runtime on a generational tight loop (using the basic pattern outlined here) still seems to be faster while remaining almost entirely backwards compatible, i.e.:
```ruby
# Pseudocode: resolve one selection across the whole breadth set,
# calling the existing per-object runtime in a tight loop:
ast = get_ast_nodes_for_this_selection
next_objects = objects.map { |obj| runtime.evaluate_selection(obj, ast, ...) }
```
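Spelled out a little further (a loose sketch with illustrative names, not GraphQL-Ruby's actual internals), the generational loop is something like:

```ruby
# Each generation resolves one level of the tree for its whole object set
# before descending, so selection bookkeeping runs once per document field.
generation = [[roots, root_selections]]
until generation.empty?
  next_generation = []
  generation.each do |objects, selections|
    selections.each do |selection|
      # Tight loop across the breadth set, reusing the per-object runtime:
      children = objects.map { |obj| runtime.evaluate_selection(obj, selection) }
      subselections = selection.subselections # hypothetical accessor
      next_generation << [children, subselections] unless subselections.empty?
    end
  end
  generation = next_generation
end
```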
A major source of improvement, from what I can tell, is that we're cutting out `evaluate_selections` (plural), where `gather_selections` would run repeatedly for the same nodes across the response tree. My point then is...
- batch resolvers are ideal as a forward trajectory.
- the current resolver model could stay pretty much the same while getting slightly faster.
- we want both resolver models in parallel so that teams can keep their existing schemas executing as-is (slightly faster) while starting to swap in the new batched pattern (sketched below).
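To make that concrete, per-field opt-in might look something like this (the `batch: true` flag and the signature are hypothetical, not a settled API):

```ruby
# Hypothetical opt-in: this field uses the new batched signature,
# while sibling fields keep their classic per-object resolvers.
field :variants, [Types::Variant], null: false, batch: true

def variants(products, arguments, context)
  by_product = Variant.where(product_id: products.map(&:id)).group_by(&:product_id)
  products.map { |product| by_product.fetch(product.id, []) }
end
```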