Connection pool optimization: move socket polling from expiry checks to connection usage
Summary
(Split from original PR here https://github.com/encode/httpcore/pull/924 but this is more refined)
Connection pool implementation is heavily doing socket polling via AsyncConnectionInterface.has_expired(). This check in turn does AsyncNetworkStream.get_extra_info("is_readable") which finally does select.poll(). This check quickly starts to add up when doing many requests through httpcore with great concurrency. Many connection sockets in the pool are constantly polled over and over again when doing requests and also when request closes.
This PR instead moved the socket polling into AsyncHTTP11Connection.handle_async_request. Here the readable-IDLE connection raises ConnectionNotAvailable which makes the request to choose a next connection from the pool. This causes lot less socket polling as now its not done every single time for all connections when pool connections are assigned. Broken connections are still properly maintained as they are still removed from the pool.
This approach is very similar on how urllib3 is validating the health of the connection coming from the pool. (This check finally uses wait_for_read which uses similar socket polling as httpcore)
Adds some previously missing tests.
Benchmark shows how the heavy socket logic causes over 4.6x overhead.
In master with async requests:
PR with async:
With synchronous code the overhead is not as large. There the connection pools socket polling causes 1.6x overhead (maybe the overhead is lower due to sync concurrency using threads where the socket polling IO is not in GIL). Rest of overhead in sync is coming from the connection pools maintenance loops that is fixed here.
In master with sync code:
PR with sync:
Trio-based backend is also affected by this. There the overhead is at similar levels of about 5x as was seen here.
Checklist
- [x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
- [x] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
- [x] I've updated the documentation accordingly.
@T-256 I fixed this PR to drop the interval based socket polling approach as you suggested here.
Instead this is now using a very similar approach as in urllib3 where the socket health is validated at the connection usage time when its taken from the pool. In oppose of polling sockets heavily for all the connections in the pool every single time requests are done and closed.
Even with the polling on usage, performance is much better than previous one.
@MarkusSintonen Do you mind/working to send PR for re-adding asyncio backend and drop anyio backend? I'm also have interest on those works.
@MarkusSintonen Do you mind/working to send PR for re-adding
asynciobackend and dropanyiobackend? I'm also have interest on those works.
Sure, I was wishing on getting the synchronization PR first approved/merged before opening it. But I can open it already.
@MarkusSintonen Do you mind/working to send PR for re-adding
asynciobackend and dropanyiobackend? I'm also have interest on those works.
Its now here: https://github.com/encode/httpcore/pull/930
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.