Alternative simplified HTTP client API
As discussed with @asvetlov, here is a proposal for an additional API to make HTTP requests, in two parts:
aiohttp.client.HttpClientSession: an API with reasonable performances, and simpler semantics thanaiohttp.ClientSession. This is what most small to average programs will want to use. The most important feature being the preloading of the request body, but it doesn't support WebSocket.aiohttp.client.requests: an API with weak performances, but very easy to play with, that will remind the hello world from therequestslibrary. This is what we can put in the README to ease people into the transition into the async world. It's also very nice to use in a shell or quick and dirty scripts.
The existing ClientSession API is not going to be deprecated, just presented in the documentation as a more advanced use case for when you want to get the best performances and control possible. The two other API in fact uses ClientSession under the hood.
The README and the documentation will reflect this, and will clearly mention the pros and cons of each API, with links to the other. In order to avoid overwhelming the new comer, and to also avoid confusion ("why 3 ways of doing the same thing ?"), we will not present them at the same time.
It should be progressive:
1 - README:
Demonstrates aiohttp.client.requests as a hello world, then leads with a phrase like:
"Learn more about those simple requests<doc>. Want better performances than this ? Read about about the more efficent alternatives<doc>."
2 - The page "Learn more about those simple requests" leads to the requests API tutorial. At the end of this tutorial, a mention is made about the pros and cons of this API, and a link to the aiohttp.client.HttpClientSession API tutorial is offered.
3 - At the end of the aiohttp.client.HttpClientSession API tutorial, a mention is made about the pros and cons of this API, and a link to the aiohttp.ClientSession API tutorial is offered.
4 - Read about about the more efficent alternatives<doc>_. explains the pros and cons of each API, and link to both the aiohttp.client.HttpClientSession and the aiohttp.ClientSession API tutorial. It will recommand to give a try to aiohttp.client.HttpClientSession, and to choose aiohttp.ClientSession only if performances, control or WebSocket are important to the user.
API design
Look and feel
Current ClientSession API
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.get('http://httpbin.org/get') as resp:
print(resp.status)
print(await resp.text())
New aiohttp.client.requests API
from aiohttp.client import requests
resp = await requests.get('http://httpbin.org/get') # open and close everything
print(resp.status)
print(resp.text) # response body is preloaded
New aiohttp.client.HTTPClientSession API
from aiohttp.client import HTTPClientSession
async with HTTPClientSession() as session: # Same params as ClientSession
resp = await session.get('http://httpbin.org/get') # client keeps a ref on each response object
print(resp.status)
print(resp.text) # response body is preloaded
# close all referenced response objects
Differences with the ClientSession API
HTTPClientSession
HTTPClientSession is similar to ClientSession, except:
- remove
connector_ownerparam: sharing a connector is an advance use case, leave it to the advanced API. - add
_responses_historyattribute: this will hold a reference to all created responses and will be looped over for cleaning onHTTPClientSession.close(). - remove
ws_response_classparam:HTTPClientSessiontargets HTTP only. For websocket, use the advanced API. - calling
ws_connect()raisesClientTypeError("HTTPClientSession does not support WebSocket. Use ClientSession instead.")as defined by:
class ClientTypeError(TypeError, aiohttp.ClientError):
pass
HTTPClientSession.request/get/post/... and requests.get/post/...
Same as ClientSession.request/get/post..., except:
- remove
chunkedparam: body will be preloaded, so it doesn't make sense. - add
content_length_limitparam: if content length is set too high, read only up to this limit as a safety mechanism. This is necesarry sinceread()is automatically called. Do you think I should set it to None by default or just a very high value ? Should it raise an exception is limit is reached ? - removed
read_until_eofparam: always read up to the "Content-Length" header orcontent_length_limit - return a
PreloadedClientResponseobject instead of aClientResponse. - PreloadedClientResponse inherits from
ClientResponse. ClientResponse.read()immediatly called onPreloadedClientResponseprior return.HTTPClientSession.contentis not aaiohttp.StreamReaderbutbytes.PreloadedClientResponse.read()raisesResponseTypeError("PreloadedClientResponse content is automatically preloaded, don't call read(), access the 'content' and 'text' attributes.")as defined by:
class ResponseTypeError(TypeError, aiohttp.ClientResponseError):
pass
PreloadedClientResponse.textis not a coroutine, but a@propertythat lazily decodes content on first read, then is replaced by a regular attribute containing the result for subsequent reads. Do you have utils to do that in the code base ?- add a
PreloadedClientResponse.encodingattribute, preloaded at__init__after.read(). Can be set manually by the user to compensate for the facttextis not a regular function anymore and you can't pass it params. - add
encoding_errorsparam: accept "errors", "replace", etc., stored inencoding_errorsattribute. Compensate for the facttextis not a regular function anymore and you can't pass param. (This could be a good addition to ClientSession as well, in case one wants the same strat for all requests instead of passing it to each call)
HTTPClientSession inherits from ClientSession.
requests will in fact be a singleton (just instanciated at the root of the module, no fancy __new__ tricks), and an instance of BasicHTTPClient. BasicHTTPClient will not inherit from ClientSession, but will use composition to delegate the work to HTTPClientSession and only expose .get/post/etc
Testing
I'll scan test_client_response.py and test_client_session.py for inspiration (espacially for the mocking), but I don't intend to duplicate the tests in there.
For HTTPClientSession, I'll create:
- test_preloaded_client_response.py
- test_http_client_session.py
And only tests the API differences with their parent, plus the signature, the behavior of overriden methods and ensure closing of the response objects.
For requests, I'll create test_requests.py and test the entire high level API, as well as use mocks to ensure we are calling the cleaning code correctly.
A few things.
Sorry for possible misleading, but the aiohttp.client already exists (while it is not promoted as a public API).
We need another name.
Since it is for simplified usage please consider aiohttp.nutshell. I'm not an English speaker though. Maybe @mjpieters can suggest something? Looks like he is good in a words juggling :)
If we pin the aiohttp.nutshell namespace, I see all new APIs in this subpackage. HTTPClientSession is cumbersome, from aiohttp import nutshell; nutshell.Client looks better.
There is no need for nested requests subname, resp = await nutshell.get() looks good enough as well.
from aiohttp import nutshell
resp = await nutshell.get('http://httpbin.org/get') # open and close everything
print(resp.status)
print(resp.text) # response body is preloaded
and
from aiohttp import nutshell
async with nutshell.Client() as client:
resp = await client.get('http://httpbin.org/get')
print(resp.status)
print(resp.text) # response body is preloaded
Since nutshell.Response is a pure dataclass (attrs class actually because we use attrs already and should support Python 3.6) without IO there is no need for client._responses_history; all responses are released before an exit from await client.get(...) and family.
Websockets are not supported, at least for the first version -- I agree.
I suggest starting from the very minimalistic API, extend the list of supported features incrementally.
Say, I'm not happy with the proposal of changing the type of resp.content attribute. It looks very confusing. Bytes mode needs more attention. Maybe the simplified mode doesn't require it at all.
resp.read() should go, let's stick with resp.text property.
Our intention should be not requests API copy, it is a not realistic target. Let's design the simplistic API from scratch.
BasicHTTPClient is not a thing. Singletons don't work with asyncio well. I can imagine nutshell.get() as a simple shortcut for async with nutshell.Client() as client; return await client.get(...).
Simple and elegant for the performance price :)
Regarding testing: please don't use mocking at all, it is not required for the discussed API. Take ./tests/test_client_functional.py for inspiration.
A few things. Sorry for possible misleading, but the aiohttp.client already exists (while it is not promoted as a public API). We need another name.
Since it is for simplified usage please consider aiohttp.nutshell. I'm not an English speaker though. Maybe @mjpieters can suggest something? Looks like he is good in a words juggling :)
Got it. My preference usually goes to something close to the domain, though. So I'd suggest "requests" or "http":
from aiohttp import http
resp = await http.get('http://httpbin.org/get') # open and close everything
async with http.Client() as client:
Or:
from aiohttp import requests
resp = await requests.get('http://httpbin.org/get') # open and close everything
async with requests.Client() as client:
But honestly I can work with any name, just chose the one you like.
Since nutshell.Response is a pure dataclass (attrs class actually because we use attrs already and should support Python 3.6) without IO there is no need for client._responses_history; all responses are released before an exit from await client.get(...) and family.
I was thinking that ".text" could lazily read and close the connection, as to mitigate the perf cost we already pays. But in retrospect, this will cost more memory and make the code more complex and harder to test. attrs class it is.
I suggest starting from the very minimalistic API, extend the list of supported features incrementally. Say, I'm not happy with the proposal of changing the type of resp.content attribute. It looks very confusing. Bytes mode needs more attention. Maybe the simplified mode doesn't require it at all. resp.read() should go, let's stick with resp.text property.
Alright, but I need to store the bytes somewhere: any request to an image, a zip or an excel file is a legit case for the simplified API, but we won't decode that as text. Also, I expect json() will run faster if it is fed from bytes, not str. Given I'll have it anyway, can I just publicly expose it, as a "payload" or a "bytes" attribute ?
Our intention should be not requests API copy, it is a not realistic target. Let's design the simplistic API from scratch.
Yes.
BasicHTTPClient is not a thing. Singletons don't work with asyncio well. I can imagine nutshell.get() as a simple shortcut for async with nutshell.Client() as client; return await client.get(...). Simple and elegant for the performance price :)
Ok. I'll make it a functional API a the root of the module. No need for a class.
Regarding testing: please don't use mocking at all, it is not required for the discussed API. Take ./tests/test_client_functional.py for inspiration.
Thank you.
Let's wait for a while, maybe the best name will come up.
aiohttp.http module already exists as aiohttp.client.
aiohttp.requests sounds like requests library in aiohttp which is not true. This fact embarasses me a little.
I just publicly expose it, as a "payload" or a "bytes" attribute ?
Let's use bytes, resp.text and resp.bytes look natural. Plus resp.json() function, right?
I do like nutshell; async HTTP requests in a nutshell. It neatly conveys that this is the digest reader version of the full API. The downside is that those building a server may ask for a nutshell simplification of the server APIs too.
My preference usually goes to something close to the domain, though. So I'd suggest "requests" or "http":
But the top-level namespace is already aiohttp. aiohttp.http feels rather redundant, aiohttp.requests both feels like either a copy of the requests project and at the same time not clearly defining why it is different from aiohttp.client (most developers don't know nor care that requests is a friendly wrapper around urllib3). If you want a literal name, go for aiohttp.simple.
But I have a soft spot for more cutesy names like nutshell.
aiohttp.simple was number two on my list.
I proposed nutshell exactly for the reason @mjpieters mentioned; async HTTP requests in a nutshell sounds like a motto :)
Cutesie names always sound cool to the authors, but if you open and stumble uppon an unknow code base with it, it gives zero information of what it does. It's the same problem with "nurseries" in trio and why there is an entire thread still open about it : https://github.com/python-trio/trio/issues/504"simple" is less ambiguous, in fact, it removes ambiguity by anwsering by itself what's so special about this new API. This is good.15:46, 16 novembre 2019, Martijn Pieters [email protected]:I do like nutshell; async HTTP requests in a nutshell. It neatly conveys that this is the digest reader version of the full API. The downside is that those building a server may ask for a nutshell simplification of the server APIs too.
My preference usually goes to something close to the domain, though. So I'd suggest "requests" or "http":
But the top-level namespace is already aiohttp. aiohttp.http feels rather redundant, aiohttp.requests both feels like either a copy of the requests project and at the same time not clearly defining why it is different from aiohttp.client (most developers don't know nor care that requests is a friendly wrapper around urllib3). If you want a literal name, go for aiohttp.simple. But I have a soft spot for more cutesy names like nutshell.
—You are receiving this because you authored the thread.Reply to this email directly, view it on GitHub, or unsubscribe. -- Sent from Yandex.Mail for mobile
Cutesie names have given us Flask and Click and other household names.
But if the name aiohttp.simple was used and the documentation for it starts with a tagline: http requests in a nutshell, that might also work.
Aren't fork or socket little playful names for such serious things?
From my understanding, simple is mostly used as an adjective. I believe the namespace should be named by something that is a noun. Is there a deadly official and serious name that fits our needs?
Also, I recall very many books about programming that are called
Anyway, the name can be changed easily, it is not a show-stopper for starting simplified API. The thing is very useful and important regardless of the naming.
Just to link these issues, I think the new proposed API above is a bad idea, but maybe some shortcut methods for simple cases could be included (e.g. ClientSession.get_text()): https://github.com/aio-libs/aiohttp/issues/5385#issuecomment-1809211479