httpcore icon indicating copy to clipboard operation
httpcore copied to clipboard

Connection pool optimization: move socket polling from expiry checks to connection usage

Open MarkusSintonen opened this issue 1 year ago • 4 comments

Summary

(Split from original PR here https://github.com/encode/httpcore/pull/924 but this is more refined)

Connection pool implementation is heavily doing socket polling via AsyncConnectionInterface.has_expired(). This check in turn does AsyncNetworkStream.get_extra_info("is_readable") which finally does select.poll(). This check quickly starts to add up when doing many requests through httpcore with great concurrency. Many connection sockets in the pool are constantly polled over and over again when doing requests and also when request closes.

This PR instead moved the socket polling into AsyncHTTP11Connection.handle_async_request. Here the readable-IDLE connection raises ConnectionNotAvailable which makes the request to choose a next connection from the pool. This causes lot less socket polling as now its not done every single time for all connections when pool connections are assigned. Broken connections are still properly maintained as they are still removed from the pool.

This approach is very similar on how urllib3 is validating the health of the connection coming from the pool. (This check finally uses wait_for_read which uses similar socket polling as httpcore)

Adds some previously missing tests.

Benchmark shows how the heavy socket logic causes over 4.6x overhead.

In master with async requests: old

PR with async: new_expiry

With synchronous code the overhead is not as large. There the connection pools socket polling causes 1.6x overhead (maybe the overhead is lower due to sync concurrency using threads where the socket polling IO is not in GIL). Rest of overhead in sync is coming from the connection pools maintenance loops that is fixed here.

In master with sync code: old_sync

PR with sync: new_expiry_sync

Trio-based backend is also affected by this. There the overhead is at similar levels of about 5x as was seen here.

Checklist

  • [x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • [x] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • [x] I've updated the documentation accordingly.

MarkusSintonen avatar Jun 13 '24 18:06 MarkusSintonen

@T-256 I fixed this PR to drop the interval based socket polling approach as you suggested here.

Instead this is now using a very similar approach as in urllib3 where the socket health is validated at the connection usage time when its taken from the pool. In oppose of polling sockets heavily for all the connections in the pool every single time requests are done and closed.

Even with the polling on usage, performance is much better than previous one.

MarkusSintonen avatar Jun 15 '24 17:06 MarkusSintonen

@MarkusSintonen Do you mind/working to send PR for re-adding asyncio backend and drop anyio backend? I'm also have interest on those works.

T-256 avatar Jun 15 '24 21:06 T-256

@MarkusSintonen Do you mind/working to send PR for re-adding asyncio backend and drop anyio backend? I'm also have interest on those works.

Sure, I was wishing on getting the synchronization PR first approved/merged before opening it. But I can open it already.

MarkusSintonen avatar Jun 16 '24 06:06 MarkusSintonen

@MarkusSintonen Do you mind/working to send PR for re-adding asyncio backend and drop anyio backend? I'm also have interest on those works.

Its now here: https://github.com/encode/httpcore/pull/930

MarkusSintonen avatar Jun 16 '24 18:06 MarkusSintonen

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 26 '25 04:04 stale[bot]