graphql-core Discussion: decoupling async data loading from async graph resolution

Open jmarshall9120 opened this issue 3 years ago • 0 comments

TLDR: I would like graphql-core to tell me which "leaf" nodes of my query need to be returned, let me fetch that data through whatever method I deem efficient, and then take care of returning the data. The current asyncio implementation seems fundamentally unable to support this.

I've been digging deep into a graphql-core build recently and have stumbled across this interesting problem. If I've missed a key feature of the library here, then please point it out.

To me, the ideal way to use the core library is to:

1. Use graphql-core to decide what data to retrieve.
2. Use a separate data loading engine to read the data.
3. Use graphql-core to return the data.
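
To make the three steps concrete, here is a minimal sketch that does not use graphql-core at all. The nested dict `SELECTION` is only a stand-in for a parsed query, and `collect_leaves`, `fetch_all`, `build_result` and `run_query` are hypothetical names invented for the illustration, not library functions.

```python
import asyncio

# Stand-in for a parsed query: nested dicts, with None marking a leaf.
SELECTION = {"ioHMIControls": {"EStopHMI": None, "JogHMI": None}}


def collect_leaves(selection, path=()):
    """Step 1: list the path of every leaf the query asks for."""
    leaves = []
    for name, sub in selection.items():
        if sub is None:
            leaves.append(path + (name,))
        else:
            leaves.extend(collect_leaves(sub, path + (name,)))
    return leaves


async def fetch_all(paths):
    """Step 2: hypothetical loader that reads every leaf in one round trip."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return {p: True for p in paths}


def build_result(selection, data, path=()):
    """Step 3: shape the fetched values like the query."""
    return {
        name: data[path + (name,)] if sub is None
        else build_result(sub, data, path + (name,))
        for name, sub in selection.items()
    }


async def run_query(selection):
    leaves = collect_leaves(selection)      # decide what to retrieve
    data = await fetch_all(leaves)          # load it in a single batch
    return build_result(selection, data)    # return it in query shape


if __name__ == "__main__":
    print(asyncio.run(run_query(SELECTION)))
```

The point of the sketch is the strict separation: nothing is fetched until the full list of leaves is known, which is exactly the boundary that is hard to find once resolvers are coroutines.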

It's interesting to consider whether this is a graphql-core-3 issue, a missing feature, or a GraphQL issue. The essential catch is that it's very hard to determine when graphql-core has finished deciding which leaves of the graph need to be fetched. Here's an example to illustrate the point.

##########################################################
## TEST GRAPH SCHEMA BASED ON ASYNCIO ####################
##########################################################
from graphql import (
    GraphQLBoolean, graphql, GraphQLSchema, GraphQLObjectType, GraphQLField, GraphQLString)
import logging
import asyncio

_logger = logging.getLogger('GrapheneDeferralTest')
_logger.setLevel('DEBUG')

query = """
{
    ioHMIControls {
        EStopHMI,
        JogHMI,
    }
}
"""

async def resolve_EStopHMI(parent, info):
    _id = '_EStopHMI_id'
    info.context['node_ids'][_id] = None
    await info.context['awaitable']
    return info.context['node_ids'][_id]
EStopHMI = GraphQLField(
    GraphQLBoolean,
    resolve=resolve_EStopHMI
)

async def resolve_JogHMI(parent, info):
    _id = '_JogHMI_id'
    info.context['node_ids'][_id] = None
    await info.context['awaitable']
    return info.context['node_ids'][_id]
JogHMI = GraphQLField(
    GraphQLBoolean,
    resolve=resolve_JogHMI
)


def resolve_ioHMIControls(parent, info):
    return ioHMIControls
ioHMIControls = GraphQLObjectType(
    name='ioHMIControls',
    fields={
        'EStopHMI': EStopHMI,
        'JogHMI':JogHMI,
    }
)

def resolve_GlobalVars(parent, info):
    return GlobalVars
GlobalVars = GraphQLObjectType(
    name='GlobalVars',
    fields={
        'ioHMIControls': GraphQLField(ioHMIControls, resolve=resolve_ioHMIControls)
    }
)

async def simulate_fetch_data(_ids):
    print(_ids)
    await asyncio.sleep(1)
    return {k:True for k in _ids.keys()}
    
async def main():
    # Objective:
    #     1. Have graph determine what data I need by partially resolving
    #     2. Pause graph resolution.
    #     3. Collect data into a `data_loader` object.
    #     4. Retrieve data via `data_loader` object.
    #     5. Resume graph resolution with loaded data.

    # 3. collect ids of data fields into a dict
    _ids = {}

    # 2. Pause graph resolution by awaiting a future
    future = asyncio.Future()
    context = {
        'node_ids': _ids,
        'awaitable': future,
    }
    schema = GraphQLSchema(query=GlobalVars)

    # 1. Determine WHAT data to return
    resolve_graph_task = asyncio.create_task(graphql(schema, query, context_value=context))

    # ?
    # There is no way to detect that resolve_graph_task
    # has finished filling the _ids dict with id values.

    # 4. Fetch the data
    fetch_data_task = asyncio.create_task(simulate_fetch_data(_ids))

    # ?
    # This await doesn't work in this order or any order
    # because of the interdependency of both tasks, coupled with
    # the mechanics of asyncio.
    await fetch_data_task

    # 5. Resume graph resolution with retrieved data.
    future.set_result(0)

    # ?
    # Return the data from the graph, as a graph result.
    # The problem is that the data is not there due to the
    # interdependency between the awaited tasks.
    result = await resolve_graph_task
    print(result)

if __name__ == '__main__':
    asyncio.run(main())

Results

{}
ExecutionResult(data={'ioHMIControls': {'EStopHMI': None, 'JogHMI': None}}, errors=None)

The example is a little long, but I wanted it to be sufficiently complex. The gist is that the current asyncio implementation offers no way to determine that all resolvers have been reached.
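
The ordering problem can be reproduced with asyncio alone. The sketch below is a simplification that assumes the resolver coroutines only start running once the executor gathers them; `resolve_all` is a hypothetical stand-in for that executor, not graphql-core code. The fetch task takes its snapshot of the ids before any resolver body has run, so it sees an empty dict.

```python
import asyncio


async def resolver(ids, key, gate):
    ids[key] = None   # register the key this leaf needs
    await gate        # pause until the data is supposedly loaded
    return ids[key]


async def resolve_all(ids, gate):
    # gather() wraps each coroutine in a new task. Those tasks are only
    # scheduled here; they run after tasks that were already queued.
    return await asyncio.gather(
        resolver(ids, "a", gate),
        resolver(ids, "b", gate),
    )


async def fetch(ids):
    return dict(ids)  # snapshot of the keys visible at fetch time


async def main():
    ids = {}
    gate = asyncio.get_running_loop().create_future()
    graph_task = asyncio.create_task(resolve_all(ids, gate))
    fetch_task = asyncio.create_task(fetch(ids))
    seen = await fetch_task   # runs before the resolver bodies do
    gate.set_result(None)
    await graph_task
    return seen, ids


if __name__ == "__main__":
    seen, ids = asyncio.run(main())
    print("keys seen by fetch:", seen)
    print("keys registered afterwards:", sorted(ids))
```

Both keys do get registered, just one event-loop iteration too late, and nothing in this setup signals the moment when registration is complete.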

Looking at the implementation, we could use a more advanced event system to manage this, but it would be a bit of work. Another possible solution would be to allow resolvers to return coroutines and defer type checking until those coroutines have themselves resolved. I think this may be the most elegant method.
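
As a rough idea of what the event-based option could look like, here is a sketch in the spirit of the DataLoader pattern, written with asyncio only. `BatchLoader`, `fake_fetch` and `resolve_leaf` are hypothetical names for this illustration. Each resolver asks the loader for a key and awaits the returned future; the loader defers the actual fetch with `loop.call_soon`, so every resolver that is already runnable gets to register its key first.

```python
import asyncio


class BatchLoader:
    """Hypothetical helper: collects keys, then fetches them in one batch."""

    def __init__(self, batch_fetch):
        self._batch_fetch = batch_fetch  # async function: list of keys -> dict
        self._pending = {}               # key -> Future awaiting its value
        self._scheduled = False
        self._tasks = set()              # keep references to running batches
        self.batches = []                # record of dispatched batches

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if not self._scheduled:
            # Defer dispatch so resolvers already queued can add their keys.
            self._scheduled = True
            loop.call_soon(self._dispatch)
        return self._pending[key]

    def _dispatch(self):
        pending, self._pending = self._pending, {}
        self._scheduled = False
        task = asyncio.get_running_loop().create_task(self._run(pending))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def _run(self, pending):
        self.batches.append(sorted(pending))
        try:
            data = await self._batch_fetch(list(pending))
        except Exception as exc:
            for fut in pending.values():
                fut.set_exception(exc)
            return
        for key, fut in pending.items():
            fut.set_result(data.get(key))


async def fake_fetch(keys):
    await asyncio.sleep(0)  # placeholder for real I/O
    return {k: True for k in keys}


async def resolve_leaf(loader, key):
    return await loader.load(key)


async def main():
    loader = BatchLoader(fake_fetch)
    values = await asyncio.gather(
        resolve_leaf(loader, "_EStopHMI_id"),
        resolve_leaf(loader, "_JogHMI_id"),
    )
    return values, loader.batches


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This only batches resolvers that become runnable in the same event-loop iteration. Resolvers reached later, for example at a deeper level of the graph, would land in a separate batch, so it narrows the problem rather than answering the question of when all resolvers have been reached.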

Thoughts?

May 25 '22 17:05 jmarshall9120
