asgiref
asgiref copied to clipboard
sync_to_async for converting generators to async generators
Fixes https://github.com/django/asgiref/issues/142
The test is failing on python3.5 because it doesn't have async generator support. What's the right thing to do here? Is throwing a RuntimeError
fair game?
Hm, asgiref does support Python 3.5 for now, so we need to do something, or do a major version bump. I think acceptable behaviour would be to throw a RuntimeError when you try to use this on 3.5, explaining the situation, and then write a test that's skipped on 3.5 for the main implementation, and then a test only for 3.5 that tests that the error is thrown.
2c: perhaps a TypeError
is a more appropriate exception class, since the error case is the decorator being applied to the wrong type of callable.
I don't mind too much; I generally use RuntimeError for things a bit more serious/unsolvable, but this could go either way. We can't mirror the underlying error, anyway, since it's a SyntaxError, so that's at least out of the question.
Actually, I think I can make this support python 3.5. Let me try my hand with __aiter__
and __anext__
from PEP 492.
This should be good to go now.
Is there some simpler version of this we could do if we drop support for some of the earlier Python versions asgiref currently supports?
If you're talking about the inspect.isgeneratorfunction()
call, then, I don't think we're in luck here.
That seems to be the only way to detect if a given function will return a generator, without actually calling that function and checking the return value.
Let me know if you find something else!
Quick question - what purpose does self._is_coroutine
serve in SyncToAsync.__init__()
?
@devxpy That's needed to signal to asyncio.iscoroutinefunction
that it should return True. We'll need to keep that for anything that outputs an async-like function.
Okay, left that unchanged.
Would you like performance tests written in pytest-benchmark?
No, no, just a quick one-off test to make sure it's not massively worse. timeit
or something will do and then paste the results here.
This PR -
In [6]: from asgiref.sync import sync_to_async
In [4]: def fn():
...: pass
...:
In [7]: %timeit sync_to_async(fn)
4.93 µs ± 70.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
On current pypi version -
In [1]: from asgiref.sync import SyncToAsync
In [2]: def fn():
...: pass
...:
In [3]: %timeit SyncToAsync(fn)
3.35 µs ± 186 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Sorry this took longer than expected
No worries - it might take me some time to get to this, unfortunately, as it's a very complex PR and reviewing it is going to take a lot of effort to make sure it doesn't subtly affect something else downstream (we may even need to cut a new major release just in case).
Hi,
A few hopefully-helpful comments:
- I really don't like the kind of "magic" done here, to try to automatically determine if the thing is a function, generator, or iterable. In particular:
- If an object is both callable and iterable (i.e. it has both call and iter methods), then there is no way to decide automatically if a function-wrapper or iterable-wrapper is desired. You need a way for the caller to say what they want, and separate function names is the clearest way to do that. (Perhaps sync_function_to_async and sync_iterable_to_async).
- If a generator is wrapped somehow, then inspect.isgeneratorfunction() might not work. E.g. partial() on a generator isn't detected by inspect.isgeneratorfunction() in Python <3.8; there are likely plenty of decorators etc that have similar issues.
- If a generator function takes a long time to run when first called (e.g. retrieving some data), but is non-blocking after that, the old behaviour is actually better. E.g. caller might do X=list(await sync_to_async(get_data)). Having the choice would be useful (e.g. having sync_function_to_async() give the old behaviour and sync_generator_to_async() the new behaviour).
- To someone reading the code, it can require a lot of work to figure out which path is taken. If you have separate functions then it's obvious from the function name.
-
The API allows iterables to be wrapped (i.e. things you can call iter() on) but not iterators (i.e. things you can call next() on). The code actually implements wrapping iterators, it's needed to support wrapping iterables. So it would be good to expose that functionality in the public API.
-
Changing the behaviour of the existing function when passed generators is a non-backward-compatible change.
-
The generator wrapping here doesn't handle all the complexities of generators. It just treats them as iterables, but they're actually a lot more complex. The caller can send a value into a generator (and it gets returned by the "yield" statement in the generator). The caller can make a "yield" statement throw an exception, and there is support for try/catch/finally around those yield statements. See PEP 342 for synchronous generator magic and PEP 525 for asynchronous generators.
So I suggest four separate new APIs - sync_function_to_async, sync_generator_to_async, sync_iterable_to_async, sync_iterator_to_async. Plus keep sync_to_async for backward compatibility, as an alias of sync_function_to_async.
If you really want a "magic" function that tries to autodetect what kind of wrapping to do, please make that a separate function.
Kind regards,
Jon
I do agree somewhat with Jon here - the sheer work required to make this polymorphic to every possible callable type maybe takes away from its understandability (I've tried three times now to code review this and I'm finding it very tough). It might be better to go for a simpler, more bulletproof approach.
As a note for other async newcomers like myself, by far the easiest way to deal with this is just to move the entire block using querysets to a separate, non-async function and then wrap that function call in a sync_to_async
where you previously had the block. I wasted a huge amount of time thinking that I should be able to get it done in the async function, and it is almost certain to not be time worth spent for others either!
While that becomes clearer after you waste time then re-read https://docs.djangoproject.com/en/3.1/topics/async/, it might be nice to add a note to that page. Would that be a reasonable addition? Should I submit a PR?
Looking at this, I'm quite sceptical that we should be going anywhere near this.
For simple cases PEP 492 has an example wrapping a sync iterator — Let users do that.
For more complex cases (where you're iterating out of a DB maybe) you don't want to go back and forth between sync and async contexts. Better to make people fetch-all upfront, or otherwise rewrite their code to be actually async, than provide a shim that looks like it's the right thing to do.
I'd lean towards Out-of-scope/wontfix and being happy with that TBH.
Agreed - this has sat around for long enough because I simply don't think we can do a good enough job of it that will be less bad for users than saying "do the work and just make the whole loop async". There's a few cases we'll want to overcome - mapping streaming responses in Django comes to mind - but that kind of needs its own handler anyway.
There's a few cases we'll want to overcome - mapping streaming responses in Django comes to mind - but that kind of needs its own handler anyway.
I don't think we should that either TBH.
See https://github.com/django/django/pull/16384 and discussion on the ticket there. (Your thoughts welcome 🙏)
Ah, I did wonder, given our previous discussions on the subject. You did still suggest mapping the other case over, just with a lot of warnings, which is fine in my eyes :)
That's the idea. There are a few wiggles still but I'm currently optimistic 😜