promise
promise copied to clipboard
DataLoader batch_load_fn + load_many not batching in graphene
This test case doesn't appear to work when using graphene:
https://github.com/syrusakbary/promise/blob/master/tests/test_dataloader.py#L32
The batch load function is called with one key at a time when calling dataloader.load_many([6, 7])
. I wasn't able to make that test case break by running it many times or with large (100,000) array sizes, which is why I suspect it's due to an interaction with graphene. It works in v2.0.2
(the pip install promise
version), but not promise/master
.
Here is roughly the code that I'm running:
class HeroLoader(DataLoader):
def batch_load_fn(self, friend_ids):
import logging; logging.error(friend_ids) ########### Console log #2 here
url = '{}/?id={}'.format(
Hero.endpoint,
','.join([str(id) for id in friend_ids])
)
response = requests.get(url)
results = response.json()['results']
return Promise.resolve(results)
class Hero(graphene.ObjectType):
endpoint = '{}/heroes'.format(HOST)
data_loader = HeroLoader()
id = graphene.Int()
name = graphene.String(name='name')
friend_ids = graphene.List(graphene.Int)
friends = graphene.List(partial(lambda: Hero))
def resolve_friends(self, args, context, info):
import logging; logging.error(self.friend_ids) ######### Console log #1 here
heroes_json = self.data_loader.load_many(self.friend_ids)
return heroes_json.then(do_stuff_with_promise......)
The console output of an infringing request looks like this (and stepping through with a debugger yields the same results):
ERROR:root:[6, 7]
ERROR:root:[6]
ERROR:root:[7]
The expected output is
ERROR:root:[6, 7]
ERROR:root:[6, 7]
If you want a test that works with one version of promise, but not the other:
- Use this repo and branch: https://github.com/curiousest/graphql-to-rest/tree/failing-promises
-
py.test --capture=no -k 'test_batch_requests' -s
Reviving this issue as we're experiencing the same problem. Version 2.0.2
works as expected but version 2.1.0
broke this batching functionality.
There seems to be a related issue in the Graphene project: https://github.com/graphql-python/graphene/issues/1075
Try adding a @Promise.safe to the resolve function. That solved it for me one time. I have no clue as to what this decorator does exactly though...
I've been working through this issue and the following worked for me.
from promise import set_default_scheduler
from promise.schedulers.asyncio import AsyncioScheduler
set_default_scheduler(AsyncioScheduler())
I tried using the scheduler
argument of the DataLoader
constructor, but found that when it constructs a Promise
(in its load
method), that scheduler isn't passed through. Later on, when a Promise
accesses its own scheduler attribute, it's None
, and falls back to the default scheduler, which is an instance of ImmediateScheduler
, which is why batching doesn't occur. By setting the default scheduler with set_default_schedulers
, both the DataLoader
and the Promise
use the same instance of AsyncioScheduler
.
If my understanding of the code is correct (and my "workaround" isn't just dumb luck), I'd be happy to submit a PR that passes the scheduler through.
set_default_scheduler() is dangerous:
class AsyncioScheduler(object):
def __init__(self, loop=None):
self.loop = loop or get_event_loop()
So if your requests are handled by threads and each thread has it's own event loop, then it would not work correctly.
Do you know of a better workaround? Otherwise, it sounds like if you're using threads to handle requests (which I'm not), the DataLoader
simply cannot be used as-is.
@tazimmerman unfortunately, I don't know of any workarounds when using threads.
We moved to using aiodataloader with graphene-django, and starting a new asyncio event loop for every request. I'm not sure if that's the right way to do it, but batching is working.
@tazimmerman unfortunately, I don't know of any workarounds when using threads.
We moved to using aiodataloader with graphene-django, and starting a new asyncio event loop for every request. I'm not sure if that's the right way to do it, but batching is working.
@amzhang Would you mind sharing an example? I'm in the middle of trying to get those two to work and can't seem to find the right place to create the event loop and gather all other results.
Try below. It's copied from our code base with irrelevant parts stripped, so has not been tested.
It uses aiodataloader
Hope this helps.
import asyncio
from promise.promise import Promise
from django.conf import settings
from django.http.response import HttpResponseBadRequest
from django.http import HttpResponseNotAllowed
from graphene_django.views import GraphQLView, HttpError
from graphql.execution import ExecutionResult
class CustomGraphQLView(GraphQLView):
# Copied from the source code with the following modifications:
# - Starting event loop before resolve so that asyncio dataloaders can run.
def execute_graphql_request(
self, request, data, query, variables, operation_name, show_graphiql=False
):
if not query:
if show_graphiql:
return None
raise HttpError(HttpResponseBadRequest("Must provide query string."))
try:
backend = self.get_backend(request)
document = backend.document_from_string(self.schema, query)
except Exception as e:
return ExecutionResult(errors=[e], invalid=True)
if request.method.lower() == "get":
operation_type = document.get_operation_type(operation_name)
if operation_type and operation_type != "query":
if show_graphiql:
return None
raise HttpError(
HttpResponseNotAllowed(
["POST"],
"Can only perform a {} operation from a POST request.".format(
operation_type
),
)
)
# See: https://docs.python.org/3/whatsnew/3.8.html#asyncio
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
extra_options = {}
extra_options["return_promise"] = True
if self.executor:
# We only include it optionally since
# executor is not a valid argument in all backends
extra_options["executor"] = self.executor
ret = document.execute(
root=self.get_root_value(request),
variables=variables,
operation_name=operation_name,
context=self.get_context(request),
middleware=self.get_middleware(request),
**extra_options,
)
if Promise.is_thenable(ret):
timeout = settings.GRAPHQL_VIEW["REQUEST_TIMEOUT_AFTER"].total_seconds()
loop.run_until_complete(asyncio.wait_for(ret.future, timeout))
return ret.get()
else:
return ret
except Exception as e: # pylint: disable=broad-except
return ExecutionResult(errors=[e], invalid=True)
finally:
asyncio.set_event_loop(None)
loop.close()
@amzhang Thanks for the quick reply! I'll take a look to see if this helps me get where I want to go.
@amzhang unfortunately it looks like this works on graphene-django v2 but not v3. Seems like the dataloader story in v3 is kind of weak at the moment.