aiobotocore icon indicating copy to clipboard operation
aiobotocore copied to clipboard

support pluggable http clients/coroutine frameworks (anyio/httpx)

Open graingert opened this issue 5 years ago • 26 comments

I'd like to be able to use aiobotocore in a project that uses trio and httpx

anyio provides a generic interface to curio/trio/asyncio that libraries can use to provide generic support for all async/await eventloops

aiobotocore would require users to pass a "request" coroutine function that provides all the http functionality that aiobotocore needs for http requests

graingert avatar Jan 11 '20 12:01 graingert

As far as I'm aware aiobotocore would need to replace all asyncio calls with the relevant anyio equivalent, and any gather/ensure_future with a nursary/spawn pattern instead

graingert avatar Jan 11 '20 12:01 graingert

should be done with https://github.com/aio-libs/aiobotocore/issues/619 which will cause a TON of refactoring and perhaps re-thinking this whole project

thehesiod avatar Jan 12 '20 23:01 thehesiod

I don't think #619 and this issue are really that coupled and could be completed in any order

graingert avatar Jan 13 '20 10:01 graingert

botocore/boto3 already supports an event callback for extensions and aiobotocore inherits that behavior without any modifications. See

  • https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html
  • use aiobotocore as a starting point to create a new base class and hook it up to a boto3 event when it creates clients.
    • https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html#creating-client-class

To focus only on the request component, if that's possible, consider details in #755 that describe how moto uses the event system to mock requests. It hooks into the before-send event. So long as the http_response returned is compatible with the rest of the response parsing, it should work. If it needs a custom response parser, register a parser-generator as a component in the client.

All of this is easily understood (mostly) from reading

  • https://github.com/boto/botocore/blob/develop/botocore/client.py

dazza-codes avatar Feb 19 '20 17:02 dazza-codes

the reason I said #619 (#659) is because it's going to greatly expand this codebase, and thus parts which would need to be refactored to support whats mentioned in this issue. IIRC there are sleeps/locks for example in the new credentials file we need to replace components in. @dazza-codes that's great insight for that one particular event, I just logged https://github.com/aio-libs/aiobotocore/issues/774 so we don't forget about it and apply it where possible. Given #659 moves us to async events a lot more will be possible.

thehesiod avatar Feb 23 '20 12:02 thehesiod

What is the roadmap here. I would contribute if there's a plan.

aiohttp dependency is discouraged in projects that use a different server web framework.

ediskandarov-cur avatar Jan 14 '21 10:01 ediskandarov-cur

@toidi step 1 would be to find any imports/calls to asyncio and replace it with the anyio equivalent

graingert avatar Jan 14 '21 10:01 graingert

@thehesiod are you ok with the proposal above?

ediskandarov-cur avatar Jan 14 '21 15:01 ediskandarov-cur

Let me take a look at anyio and report back

thehesiod avatar Jan 15 '21 04:01 thehesiod

ok looking at anyio that's not going to solve the http client portion, that tries to solve asyncio event loop plugging, which actually is already designed to be globally replaceable (see for example how uvloop works). From what I can see botocore is already working on a pluggable http client session, see: https://github.com/boto/botocore/blob/develop/botocore/endpoint.py#L280 however it's not piped to either Config or create_client yet from what I can see, seems like a WIP. So I think we need to wait to see what botocore is going to do. Another factor is that it's not as simple as just allowing a aiohttp replacement, you need to port all the exceptions as well (https://github.com/aio-libs/aiobotocore/blob/master/aiobotocore/endpoint.py#L156), I just went through that in fact on the most recent botocore update. looping in @terrycain .

thehesiod avatar Jan 18 '21 07:01 thehesiod

that tries to solve asyncio event loop plugging, which actually is already designed to be globally replaceable

I'm looking to replace the coroutine framework not the event loop

graingert avatar Jan 18 '21 08:01 graingert

ah I c, sorry this is a two prong request, probably should be split as they're not necessarily related

thehesiod avatar Jan 18 '21 08:01 thehesiod

Well it's mainly about the pluggable coroutine framework, as I'd like to be able to use aiobotocore on curio and trio applications. For this to happen asyncio and aiohttp both need replacing as they both only support asyncio event loops

graingert avatar Jan 18 '21 08:01 graingert

So, theres 2 pieces of work here. Making the event loop more customisable than just the asyncio backend implementation so we can use things like trio and curio. As well as making the http client swappable as aiohttp does not work with curio/trio. Both bits of work are required before any of this work is really usable.

In terms of swapping asyncio invocations to anyio, from a glance I don't see that causing any issues, it would be nice to see if there is any performance degradation with the switch, though it should just be a wrapper to the asyncio function call.

terricain avatar Jan 18 '21 10:01 terricain

the changes to aiobotocore for anyio will be quite minimal:

aiobotocore/_endpoint_helpers.py:    asyncio.TimeoutError,
aiobotocore/credentials.py:        self._refresh_lock = asyncio.Lock()
aiobotocore/credentials.py:        self._refresh_lock = asyncio.Lock()
aiobotocore/credentials.py:    def __init__(self, *args, popen=asyncio.create_subprocess_exec, **kwargs):
aiobotocore/endpoint.py:    # NOTE: The only line changed here changing time.sleep to asyncio.sleep
aiobotocore/endpoint.py:            await asyncio.sleep(handler_response)
aiobotocore/hooks.py:            if asyncio.iscoroutinefunction(handler):
aiobotocore/response.py:class AioReadTimeoutError(ReadTimeoutError, asyncio.TimeoutError):
aiobotocore/response.py:        except asyncio.TimeoutError as e:
aiobotocore/utils.py:RETRYABLE_HTTP_ERRORS = (aiohttp.client_exceptions.ClientError, asyncio.TimeoutError)
aiobotocore/utils.py:                except asyncio.TimeoutError:
aiobotocore/utils.py:    def __init__(self, session=None, sleep=asyncio.sleep):
aiobotocore/waiter.py:            await asyncio.sleep(sleep_amount)
asyncio.Lock() -> anyio.create_lock()
asyncio.create_subprocess_exec -> anyio.open_process
asyncio.iscoroutinefunction -> asyncio.iscoroutinefunction
asyncio.sleep() -> anyio.sleep()

and I think we'd need a TIMEOUT_ERRORS = (asyncio.TimeoutError, TimeoutError)

graingert avatar Jan 18 '21 10:01 graingert

the TimeoutErrors would have to be handled a bit more carefully as modern coroutine frameworks discourage "libraries from using internal timeouts. Instead, it encourages the caller to enforce timeouts, which makes timeout code easier to compose and reason about."

so this code:

https://github.com/aio-libs/aiobotocore/blob/3faf885644e01b01cffd945a63e079d6fd446ffa/aiobotocore/utils.py#L44-L72

would use anyio.move_on_after to handle timeouts per retry

    async def _fetch_metadata_token(self):
        self._assert_enabled()
        url = self._base_url + self._TOKEN_PATH
        headers = {
            'x-aws-ec2-metadata-token-ttl-seconds': self._TOKEN_TTL,
        }
        self._add_user_agent(headers)

        request = botocore.awsrequest.AWSRequest(
            method='PUT', url=url, headers=headers)

        async with self._session(timeout=None,
                                 trust_env=self._trust_env) as session:
            for i in range(self._num_attempts):
                with anyio.move_on_after(timeout):
                    try:
                        async with session.put(url, headers=headers) as resp:
                            text = await resp.text()
                            if resp.status == 200:
                                return text
                            if resp.status in (404, 403, 405):
                                return None
                            if resp.status in (400,):
                                raise BadIMDSRequestError(request)
                    except RETRYABLE_HTTP_ERRORS as e:
                        logger.debug(
                            "Caught retryable HTTP exception while making metadata "
                            "service request to %s: %s", url, e, exc_info=True)
                    except aiohttp.client_exceptions.ClientConnectorError as e:
                        if getattr(e, 'errno', None) == 8 or \
                                str(getattr(e, 'os_error', None)) == \
                                'Domain name not found':  # threaded vs async resolver
                            raise InvalidIMDSEndpointError(endpoint=url, error=e)
                        raise
        return None

graingert avatar Jan 18 '21 10:01 graingert

the similar discussion in HTTPX it didn't seem to invest into anyio yet

if aiobotocore would go for pluggable HTTP client - coroutine framework is not a requirement

https://github.com/encode/httpx/issues/296

ediskandarov-cur avatar Jan 18 '21 14:01 ediskandarov-cur

@toidi httpx uses its own abstraction https://github.com/encode/httpcore/blob/master/httpcore/_backends/anyio.py which is for opening sockets, creating locks and sleeping. aiobotocore also needs subprocess support

graingert avatar Jan 18 '21 14:01 graingert

@emcpow2 httpx and encode in general have fully bought into anyio now: https://github.com/encode/httpcore/blob/f0d16e9462910c13d3b72039223346a3fe8a3299/CHANGELOG.md#fixed-2

graingert avatar Jul 23 '21 22:07 graingert

I'm also really keen to use aiobotocore with Trio - and it looks like there's a reasonable two-step process:

  1. Switch from aiohttp to httpx as our http client; these days it uses anyio internally and thus runs natively on top of asyncio ortrio. The only changes this would require to user code are in exception handling, if someone is currently catching aiohttp-specific exceptions (likely!). A backwards-compatibility layer would be annoying but possible.

  2. Switch from asyncio to anyio. This may require a few API changes to support structured concurrency, and would generally involve somewhat deeper changes to somewhat less code.

The best way forward is probably for someone to draft a PR for (1), and then we can decide whether it's easier to do the two parts together or separately.

Zac-HD avatar Aug 29 '23 23:08 Zac-HD

yea, we've been internally switching to httpx for some of jobs because aiohttp keeps giving random https payload errors. I'm all for this now

thehesiod avatar Aug 30 '23 00:08 thehesiod

it's easier now with the refactoring botocore has done and we've incorporated

thehesiod avatar Aug 30 '23 00:08 thehesiod

After working on implementing httpx this is likely going to cause quite a few changes in user code, and some functionality would be straight up lost due to httpx not having the exact same feature set as aiohttp - e.g. https://github.com/encode/httpx/discussions/2211

In general the code feels quite tightly coupled to aiohttp, such that it would look very different if it'd been written for httpx from the start, and even where it would be possible in theory to create a seamless transition for end users it'd lead to very clunky code. E.g. I'm starting to want to completely drop response.StreamingBody and directly hand the httpx.Response object to the user: https://github.com/aio-libs/aiobotocore/pull/1085#issuecomment-1929749830

So I'm starting to lean towards it being easiest initially to implement httpx alongside aiohttp, and you can specify which one to use upon initialization of the session. This way I don't have to initially bother with full support for niche stuff like proxies, ssl certificates, etc etc (which in several cases doesn't even seem to have tests, which makes it very hard for me to guarantee functionality isn't changed) and can much more easily break it up into multiple PRs. And then when the httpx implementation is done deprecation warnings can be added to aiohttp-specific stuff.

jakkdl avatar Feb 06 '24 14:02 jakkdl