cpython icon indicating copy to clipboard operation
cpython copied to clipboard

make `asyncio.iscoroutinefunction` a deprecated alias of `inspect.iscoroutinefunction` and remove `asyncio.coroutines._is_coroutine`

Open graingert opened this issue 1 year ago • 21 comments

currently asyncio.iscoroutinefunction and inspect.iscoroutinefunction behave differently in confusing and hard to document ways. It's possible to bring them into alignment but I think it would be better to make asyncio.iscoroutinefunction a deprecated alias of inspect.iscoroutinefunction and remove asyncio.coroutines._is_coroutine.

This is now possible with the recent removal of @asyncio.coroutine and support for AsyncMock and other duck-type functions in inspect.iscoroutinefunction

the only caveats are - what should happen to users of the asyncio.coroutines._is_coroutine mark? eg: https://github.com/python/cpython/blob/1c0cf0aef4b43b875b0191fd6f8956c929694a1f/Lib/asyncio/tasks.py#L686-L696

and all of these:

Lib/test/test_asyncio/test_base_events.py:    m_socket.getaddrinfo._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_reader._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_writer._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_reader._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_writer._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_reader._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_writer._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        m_socket.getaddrinfo._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        m_socket.getaddrinfo._is_coroutine = False
Lib/test/test_asyncio/test_base_events.py:        self.loop._add_reader._is_coroutine = False
Lib/test/test_asyncio/test_selector_events.py:        self.loop.add_reader._is_coroutine = False
Lib/test/test_asyncio/test_subprocess.py:        protocol.connection_made._is_coroutine = False
Lib/test/test_asyncio/test_subprocess.py:        protocol.process_exited._is_coroutine = False
Lib/unittest/mock.py:    mock._is_coroutine = asyncio.coroutines._is_coroutine
Lib/unittest/mock.py:        # iscoroutinefunction() checks _is_coroutine property to say if an
Lib/unittest/mock.py:        self.__dict__['_is_coroutine'] = asyncio.coroutines._is_coroutine
  • PR: gh-99247

graingert avatar Jul 17 '22 00:07 graingert

changing _wrap_awaitable to be:

async def _wrap_awaitable(awaitable): 
   """Helper for asyncio.ensure_future(). 
  
   Wraps awaitable (an object with __await__) into a coroutine 
   that will later be wrapped in a Task by ensure_future(). 
   """ 
   return await awaitable

would have a few subtle changes - eg the if not called_wrap_awaitable check in asyncio.ensure_future(...) isn't needed anymore, and this would prevent wrapping objects that have .__dict__["__await__"] callables

graingert avatar Jul 17 '22 00:07 graingert

~I'm pretty sure all of the assignments in the test suite of protocol.process_exited._is_coroutine = False etc, are redundant or worse, invalidated by https://github.com/python/cpython/pull/94050~ Edit: yes they are redundant see https://github.com/python/cpython/pull/94926/files#r922839081

the mock._is_coroutine = asyncio.coroutines._is_coroutine use in unittest.mock is also probably safe to remove?

graingert avatar Jul 17 '22 00:07 graingert

aha I've just checked and the protocol.process_exited._is_coroutine = False lines were made redundant in https://github.com/python/asyncio/pull/459

graingert avatar Jul 17 '22 13:07 graingert

So that's the ancient external asyncio project, which nobody should use any more. Does the corresponding code (or corresponding commits) exist in the stdlib version?

gvanrossum avatar Jul 17 '22 16:07 gvanrossum

So that's the ancient external asyncio project, which nobody should use any more. Does the corresponding code (or corresponding commits) exist in the stdlib version?

Yep, the fix was ported everywhere but the now redundant workarounds were left in, there's more info here https://github.com/python/cpython/pull/94926/files#r922839081

graingert avatar Jul 17 '22 17:07 graingert

Before making it a deprecated alias, it would be better to investigate:

  • How much code will be affected by this change?
  • How this interacts with cython coroutines?

kumaraditya303 avatar Jul 18 '22 13:07 kumaraditya303

Before making it a deprecated alias, it would be better to investigate:

* How much code will be affected by this change?

how about if I just add a deprecation notice and leave the implementation alone?

* How this interacts with cython coroutines?

inspect.iscoroutinefunction now uses inspect._signature_is_functionlike and so it should 'just work', I've let them know and will work on a cython test case to make sure

graingert avatar Jul 18 '22 14:07 graingert

how about if I just add a deprecation notice and leave the implementation alone?

That's rarely a good strategy. Many people don't read the docs and keep using the function (if they are using it) and so once the deprecated function is deleted their code breaks without warning.

gvanrossum avatar Jul 18 '22 18:07 gvanrossum

how about if I just add a deprecation notice and leave the implementation alone?

That's rarely a good strategy. Many people don't read the docs and keep using the function (if they are using it) and so once the deprecated function is deleted their code breaks without warning.

The alternative is asyncio.iscoroutinefunction starts returning False when it used to return True, which will break user code in a silent way

graingert avatar Jul 18 '22 18:07 graingert

I thought we were talking about putting a warning in it? It should only warn when it's going to return True but the inspect function would return False.

gvanrossum avatar Jul 18 '22 19:07 gvanrossum

It should only warn when it's going to return True but the inspect function would return False.

ok I've pushed that change

graingert avatar Jul 19 '22 08:07 graingert

the only caveats are - what should happen to users of the asyncio.coroutines._is_coroutine mark?

The _is_coroutine marker (for later use with asyncio.iscorountinefunction()) is leveraged by asgiref and Django to mark sync functions that return an awaitable, and are otherwise not detectable as such.

Is there an official way to do this? (I need to dig into the inspect version of the check to see if we can mimic what we're doing already, but on the initial pass substituting in inspect fails).

If not can we pause before removing it?

@gvanrossum had https://github.com/python/cpython/issues/67707#issuecomment-1093675953 — which suggests just calling the thing and seeing if it returned a coroutine object, but in Django's case we have a middleware and view functions which we're adapting for later use. Having a way of saying, No this is a coroutine function just by inspecting is pretty essential. 😬

Thanks!

carltongibson avatar Oct 29 '22 14:10 carltongibson

the only caveats are - what should happen to users of the asyncio.coroutines._is_coroutine mark?

The _is_coroutine marker (for later use with asyncio.iscorountinefunction()) is leveraged by asgiref and Django to mark sync functions that return an awaitable, and are otherwise not detectable as such.

Is there an official way to do this? (I need to dig into the inspect version of the check to see if we can mimic what we're doing already, but on the initial pass substituting in inspect fails).

The question is, what do those libraries do differently when the result is true or false? Asyncio itself only uses asyncio.iscoroutinefunction() in a few places where it wants to fail earlier, with a clearer error message, when the user passes something that isn't going to work -- in particular, places that require a callback function to be passed where users are particularly likely to be confused and pass in a coroutine.

But I'm guessing you have a different use in mind.

If not can we pause before removing it?

I'm happy to pause this work until you come back with an answer to the above question (but don't wait until 3.12 is deemed feature complete please).

@gvanrossum had #67707 (comment) — which suggests just calling the thing and seeing if it returned a coroutine object, but in Django's case we have a middleware and view functions which we're adapting for later use. Having a way of saying, No this is a coroutine function just by inspecting is pretty essential. 😬

Let me see if I understand this. You have some code that essentially wants to tread

async def callback1(...): ...

differently from

def callback2(...): ...

Maybe if it's an async def you want to await it while if it's a plain function you want to use loop.call_soon() on it; or maybe something else, (it might help if I knew what).

But now there's a special case

async def _real_callback3(...): ...

def callback3(...):
    return _real_callback3(...)

and you want to treat callback3 the same as callback1, which currently you can accomplish by adding

callback3._is_coroutine = asyncio._is_coroutine

But why?

Couldn't you just write

async def _real_callback3(...): ...

async def callback3(...):
    return await _real_callback3(...)

?

gvanrossum avatar Oct 31 '22 18:10 gvanrossum

Hi @gvanrossum — Thank you for your thoughtful reply, very insightful. 😊

… don't wait until 3.12 is deemed feature complete please.

Absolutely. Let's see if we can resolve this now, not least because selfishly we'd need a migration strategy if this goes away.

Your initial analysis looks correct:

async def callback1(...): ...


async def _real_callback3(...): ...

def callback3(...):
    return _real_callback3(...)

... you want to treat callback3 the same as callback1, which currently you can accomplish by adding

callback3._is_coroutine = asyncio._is_coroutine

This is exactly what we're doing.

But why?

Paraphrasing:

Couldn't you just write the sync wrapper function as async def?

The question is, what do those libraries do differently when the result is true or false?

Let me give you the examples.

Django Views

So, Django being a web-framework, views are callables that turn requests into responses.

At its simplest, you declare a function to do this:

def view(request, *args, **kwargs): ...

You can also define equivalent async views:

async def view(request, *args, **kwargs): ...

Both types of view can be run under both WSGI and ASGI:

  • Under WSGI async def views are run using asgiref's async_to_sync helper, which is a bit like asyncio.run().
  • The dual of that, under ASGI def views are run using asgiref's sync_to_async helper, which handles the run_in_executor manoeuvre to dispatch the sync function in a thread pool, but also ensures that the correct thread is used where we have e.g. ORM operations in play.

So far so good, if you have an async def view iscoroutinefunction() should correctly identify it.

But Django also allows you to define class-based views. Normally these define handlers per-HTTP verb, so:

from django.views import View


class MyView(View):
    def get(self, *args, **kwargs): ...

These can also define async handlers:

class MyView(View):
    async def get(self, *args, **kwargs): ...

In order to have per-request view instances, allowing you to store state on self, Views are not directly callable but rather define an as_view() classmethod that returns a wrapper function responsible for instantiating the View instance and dispatch the right method handler, depending on the incoming request.

There are several (sync) layers involved here — as_view() to the view callback to View.dispatch() to the method handler, such as get() — but the final execution by either the WSGI or the ASGI handler needs to know whether the method handler is sync or async in order to correctly dispatch it.

We're doing that currently by marking the wrapper function returned from as_view() with asyncio.coroutines._is_coroutine, since even though the wrapper is sync its ultimate return value is a coroutine that must be awaited.

In this case, I think we could branch on the definition of the returned view to make it async def.

My concern would be the amount of duplication, over simply setting an attribute on the function.

Django Middleware

The example with middleware is a little more complex.

Before calling a view Django passes the request through a series of middleware, which act kind of like decorators, acting before and after the next item down the chain, the last of which will in the end be the view itself.

Like views, middleware can be either sync or async, but they can also support both calling patterns.

In order to avoid switching more than necessary between sync and async contexts whilst processing the middleware chain, we adapt (again using the asgiref.sync utilities) each middleware to the calling style of the callable it wraps, before finally adapting the outer middleware to the WSGI or ASGI handler as may be. This adaptation is done once at startup when loading the middleware chain, which precludes calling the middleware and then awaiting it if the return happens to be a coroutine.

For class based middleware (which unlike views are single instance callables) supporting both async and sync modes we use the asyncio.coroutines._is_coroutine marker to know to switch to the async implementation in __call__. (This one seems like a bit of a sticker.)

asgiref.sync

The SyncToAsync class marks itself with asyncio.coroutines._is_coroutine. The class has an async def __call__() but that's not sufficient to get through asyncio.iscoroutinefunction().

Commenting out the mark:

>>> import asyncio
>>> from asgiref.sync import SyncToAsync
>>> def f():
...    print(1)
... 
>>> s = SyncToAsync(f)
>>> asyncio.iscoroutinefunction(s)
False  # Not the actual behaviour, which sets the _is_coroutine marker

Whether this behaviour is essential, I cannot immediately say, but the asgiref test suite has an explicit check that a SyncToAsync instance is correctly detected as a coroutine.

Thoughts then...

Hopefully that gives you an insight into how we got here. Needing to identify a coroutine function, no matter how it came to be, seem(s|ed) quite important to be able to integrate async into Django.

Looking at the Django middleware and asgiref.sync cases, possibly we could rewrite, but I suspect (at least in the Django case) we'd need to put a shim in place to look for a marker before calling the stdlib implementation.

OK, that's do-able.

But I wonder, is it that we're doing this wrong? Is it really not a thing to have a sync function returning an awaitable? 🤔

The question is, what do those libraries do differently when the result is true or false?

They use the result to adapt either ourselves or the function to the match the required calling style.

Why is this necessary? Because we're dealing with both sync and async code and we can't say upfront which it's going to be.

Could we re-write it? Yeah, maybe. Likely, maybe, even. But having a marker has proven pretty handy so far. It would be quite disruptive (at the least) if it were to disappear.

Thanks for considering. Really happy to take input 🙂

🎁

//cc @andrewgodwin

carltongibson avatar Nov 01 '22 12:11 carltongibson

Some quick thoughts:

The root of the "problem" is something that seems more prevailing in web frameworks than in the core of Python, i.e., introspecting something and taking a different action based on the outcome. A very old example is web template libraries that can name objects that may or may not be callable, and have no explicit "call it" syntax -- they just implicitly call when the object is callable.

In the core we prefer explicit over implicit, or we'd use a method that each type can redefine. But the web world is different, and I'm okay whit that.

I think that the way forward is a decorator that sets the marker, and that decorator ought to be part of the inspect API, so that inspect can look for it. Would you be willing to help design such an API and implement it?

gvanrossum avatar Nov 01 '22 17:11 gvanrossum

Just to extend from what Carlton said on this - as the original author of all the async piping through Django, the key thing I needed to know when I first designed all this is if a callable is going to return a coroutine or not without calling it, as if it's a synchronous function then we've already accidentally triggered the side-effects if we call it. While I like the simplicity of the Python design of "asynchronous callables are just callables that return a coroutine", this is one case where it's quite hard to duck-type as you can't inspect the object without potential side-effects.

It's essentially missing type information - we could use type hints at runtime to, in theory, perform the same marking, and we could also indeed just vendor our own mechanism for this and continue placing an attribute on the callable object (after all, most of what is happening here is Django view classes being inspected by Django request flows, so we do control both ends).

You are totally correct that in a nice, clean implementation from a blank slate, we'd have something explicit like two different kinds of URL router or middleware definition, or maybe even just make it all coroutine-by-default - alas, backwards compatibility meant we had to continue some of Django's existing design patterns to avoid making a huge amount of required upgrade work for people.

When you propose a decorator that sets a marker, are you thinking something similar to _is_coroutine and just adding a nicer interface? I'd personally be happy to help design and/or vet something in inspect that allows us to reliably mark and determine if something is about to return a coroutine or not.

andrewgodwin avatar Nov 02 '22 00:11 andrewgodwin

When you propose a decorator that sets a marker, are you thinking something similar to _is_coroutine and just adding a nicer interface? I'd personally be happy to help design and/or vet something in inspect that allows us to reliably mark and determine if something is about to return a coroutine or not.

Yes, that's exactly what I was thinking. We need to think a bit carefully about deprecating asyncio.iscouroutinefunction().

gvanrossum avatar Nov 02 '22 00:11 gvanrossum

Thanks for your understanding @gvanrossum.

A very old example is web template libraries that can name objects that may or may not be callable, and have no explicit "call it" syntax -- they just implicitly call when the object is callable.

Yes. The Django template language ≈ does this. 😊

Would you be willing to help design such an API and implement it?

Ha. Sneaky. 🙂 Yes, OK, fair's fair — I'm happy to input as well.

/me looks at https://peps.python.org/pep-0693/#schedule 🤔

carltongibson avatar Nov 02 '22 07:11 carltongibson

This shouldn't require a PEP or anything like that. Just think through the API a bit. You could start by writing the code and reason back from what you learned by trying to do that.

gvanrossum avatar Nov 02 '22 18:11 gvanrossum

I can certainly begin. Let me track down the tests for inspect.iscoroutinefunction() and see if I can add some failing cases. I'll open a draft PR to discuss.

carltongibson avatar Nov 02 '22 19:11 carltongibson

Great. Please add me to the PR.

gvanrossum avatar Nov 02 '22 19:11 gvanrossum