httpcore icon indicating copy to clipboard operation
httpcore copied to clipboard

Fix 2 bugs and improve performance from O(N²+NxM) to O(N+M) in _assign_requests_to_connections

Open VictorPrins opened this issue 3 months ago • 10 comments

Summary

Fixes 2 logic bugs in _assign_requests_to_connections and improves complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests.

Although I attempted a smaller PR with an atomic change, I ultimately decided for a redesign of the method because the algorithm design of the method was fundamentally flawed.

Motivation

_assign_requests_to_connections is in the hot path of this library and the bugs and complexity issue were severly slowing down my application.

Bug 1: max_keepalive_connections is not respected

This causes idle connections to be evicted unnecessarily. This bug is also addressed in this PR (currently approved but not merged).

The bug is on lines 294-295. The expression on line 294 simply returns len(self._connections), ignoring the number of idle connections, and compares this number to self._max_keepalive_connections. https://github.com/encode/httpcore/blob/5974b03c7df89d3ee4e23779900d5349d550753c/httpcore/_async/connection_pool.py#L294-L295

This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.

Bug 2: multiple requests are assigned to the same HTTP/1.1 connection, leading to ConnectionNotAvailable exceptions

The bug is on lines 322-323. An available connection is assigned to a request, but the status of that connection is not actually changed, resulting in that connection also being assigned to the next request in the queue.

https://github.com/encode/httpcore/blob/5974b03c7df89d3ee4e23779900d5349d550753c/httpcore/_async/connection_pool.py#L322-L323

This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.

Perf: Reduce complexity from quadratic to linear

This PR reduces time complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests. In a typical use case, with for instance 100 connections in the pool and 500 requests in the queue this reduces complexity dramatically, especially since _assign_requests_to_connections is in the hot path of this library because it will be called twice for every request (on start and on finish).

In my use case, this PR reduces the CPU utilization of my application by half. Profiling shows that _assign_requests_to_connections previously accounted for 59% of all samples; after applying this PR, its share dropped to 4%.

The problem: unnecesary nested loops

This snippet contains a nested loop over all connections (O(N²)). This is unnecessary and can be done in a single pass (O(N)).

This snippet contains a nested loops over all requests and connections (O(NxM)). This is unnecessary and can be done in a single pass (O(M)) with early exit if all connections are occupied.

Checklist

  • [x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • [x] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • [x] I've updated the documentation accordingly.

VictorPrins avatar Sep 14 '25 18:09 VictorPrins

Already tried here the same but httpcore seems to be abandoned https://github.com/encode/httpcore/pull/929

MarkusSintonen avatar Sep 15 '25 15:09 MarkusSintonen

@MarkusSintonen Thanks for replying and for linking that PR. httpcore gets almost 10M downloads a day through the httpx dependency, and usage is still increasing over time. So I hope the project isn't being abandoned. Hopefully we can draw some attention to this.

VictorPrins avatar Sep 15 '25 16:09 VictorPrins

Citing some issues related to this, including in downstream libraries relying on httpcore:

  • https://github.com/encode/httpx/issues/3215
  • https://github.com/BerriAI/litellm/issues/6592
  • https://github.com/openai/openai-python/issues/1596
  • https://github.com/encode/httpx/discussions/3100

VictorPrins avatar Sep 15 '25 17:09 VictorPrins

+1

florisheijmans avatar Sep 16 '25 14:09 florisheijmans

+1

svilupp avatar Sep 17 '25 06:09 svilupp

Good luck, #1000 has been hanging for 6 months, despite being so tiny. The review you saw is not binding, I can't merge it until some maintainer approves it. This project does seem abandoned, I'm afraid.

blazewicz avatar Sep 22 '25 20:09 blazewicz

@blazewicz Nope, just the attention isn't on this PR for a reason... v1 comprehensively addresses this issue and is being quietly worked on. It's worth taking a look at the relative complexity of the connection pool handling there vs. the connection pool handling in the the existing httpcore.

lovelydinosaur avatar Sep 22 '25 22:09 lovelydinosaur

I have just released a new high-perf http library which is fully Rust-reqwest based. Batteries are included, including unit testing support. Go check it out: https://github.com/MarkusSintonen/pyreqwest :)

MarkusSintonen avatar Oct 12 '25 19:10 MarkusSintonen

@MarkusSintonen Seems like an interesting project, tho this space isn't your personal advertising billboard.

lovelydinosaur avatar Oct 12 '25 21:10 lovelydinosaur

Okay I'm going to walk that one back a little bit...

I've gotten a bit fatigued by some interactions in the OSS space. Because reasons. My apologies.

That's a really smart project you've got there. One route we might be able to take would be... would a pyreqwest transport make sense for httpx?

That'd potentially allow a graceful transition for existing deployments to really quickly start taking advantage of that work without having to make API changes to get there.

(I've been considering the same for the v1 work that I've been on...)

lovelydinosaur avatar Oct 13 '25 05:10 lovelydinosaur