
Limit the maximum number of open sockets

Open mcollina opened this issue 1 year ago • 17 comments

One of the most common causes of "fetch failed" is that we don't limit the number of outgoing sockets, so the process runs out of them, triggering a timeout error.

I think we should add a limit of 127 per pool by default (or any other magic number).

mcollina avatar May 16 '24 12:05 mcollina

The global dispatcher or the overall default Client/Agent settings?

metcoder95 avatar May 16 '24 20:05 metcoder95

Both are good ideas.

mcollina avatar May 16 '24 21:05 mcollina

The Elasticsearch client library (of which I am the current maintainer) currently has an open request from the Kibana team to support limiting the maximum open sockets and max idle sockets when keepalive is enabled. If this functionality is added to Undici, it would be helpful if it exposed options to limit the max open and max total sockets, both globally and per upstream target, as well as max free sockets for keepalive use cases, to provide parity with http.Agent's options:

  • maxFreeSockets: max free sockets when keepalive is enabled
  • maxSockets: max concurrent sockets per origin
  • maxTotalSockets: max concurrent sockets globally
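
For reference, all three of the options above exist today on Node's core http.Agent; a minimal configuration might look like this (the numbers are illustrative, not recommendations):

```js
// Node.js core http.Agent options referenced above; the values here
// are illustrative, not recommended defaults.
import http from 'node:http'

const agent = new http.Agent({
  keepAlive: true,
  maxFreeSockets: 32,   // idle sockets kept open when keepAlive is enabled
  maxSockets: 64,       // max concurrent sockets per origin
  maxTotalSockets: 256  // max concurrent sockets across all origins
})
```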

I've not contributed to this library before, but would love to help add this functionality.

JoshMock avatar Aug 28 '24 19:08 JoshMock

Overall, I think this should already be doable (unless I'm overlooking something).

By using Pool you can limit the number of connections (Client instances) that the pool can have, as a Client maps to a single socket.

Ref: https://undici.nodejs.org/#/docs/api/Pool?id=parameter-pooloptions
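
For example, a minimal sketch using the documented connections option from PoolOptions (the origin and the value 10 are illustrative):

```js
// Cap the number of Client instances (and therefore sockets) a Pool
// opens to a single origin.
import { Pool } from 'undici'

const pool = new Pool('http://localhost:3000', {
  connections: 10 // at most 10 concurrent sockets to this origin
})

const { statusCode, body } = await pool.request({ path: '/', method: 'GET' })
console.log(statusCode)
await body.text() // consume the body so the connection can be reused
```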

The options listed should be ported to the Agent, as it can manage them at a higher level; in the end, depending on the number of connections passed, each origin can be backed by a Pool of connections or a single connection.

maxFreeSockets - maybe maxIdleConnections? Here the Pool and/or the Agent can be preemptive and destroy idle connections rather than waiting for more requests to come.

maxTotalSockets - this should be ported to Agent, I believe, as the Pool already has an option for that.

Would you like to send a PR for it?

metcoder95 avatar Aug 28 '24 20:08 metcoder95

@JoshMock this is already possible at Pool level.

mcollina avatar Aug 31 '24 06:08 mcollina

I struggle to tell from the code, but what happens if a pool reaches its connection limit? I can see that here we won't create a dispatcher. What are the implications of this? Will the request be dropped or queued?

tdeekens avatar Apr 22 '25 07:04 tdeekens

The request might fail if the process runs out of file descriptors, AFAIK

gurgunday avatar Apr 22 '25 11:04 gurgunday

So is it recommended to set a limit, or rather not to? Assuming I don't set a limit and the OS runs out of sockets, the request fails. If I set a premature (maybe too low) limit, the request fails because the pool can't create a dispatcher?

tdeekens avatar Apr 22 '25 11:04 tdeekens

You should add a pool limit; you won't lose requests if the pool limit is reached
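
For example, one way to apply this advice globally is through the documented Agent option together with setGlobalDispatcher; a minimal sketch (128 is an arbitrary value):

```js
// Cap concurrent connections per origin for everything that uses the
// global dispatcher; excess requests queue rather than failing outright.
import { Agent, setGlobalDispatcher } from 'undici'

setGlobalDispatcher(new Agent({ connections: 128 }))
```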

gurgunday avatar Apr 22 '25 21:04 gurgunday

The missing piece of "max open sockets" functionality that is still needed here is parity with maxTotalSockets from http.Agent. When setting connections on an Agent or BalancedPool, it is enforced on a per-upstream basis rather than at the agent/pool level. If you know the exact number of upstreams you're going to have, you could set connections to MAX_CONNS / upstreams.length, but that does not help when an upstream is added later, because connections cannot be changed once set.
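
As a sketch of the workaround described above (MAX_CONNS and the upstream list are illustrative, and the split only holds while the upstream set stays fixed):

```js
// Divide a global connection budget evenly across a known, fixed set of
// upstreams; `connections` cannot be changed after construction.
import { BalancedPool } from 'undici'

const MAX_CONNS = 100
const upstreams = ['http://10.0.0.1:9200', 'http://10.0.0.2:9200']

const pool = new BalancedPool(upstreams, {
  connections: Math.floor(MAX_CONNS / upstreams.length)
})
```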

I would be happy to contribute this functionality, but would like to know from the primary maintainers which dispatcher implementations you think are good candidates for adding this functionality before doing the work.

JoshMock avatar Apr 25 '25 19:04 JoshMock

As an aside: it's very hard to tell the difference between what BalancedPool and Agent do without inspecting the code. By looking at the docs, they appear to do the same thing, except that Agent doesn't expose PoolStats and appears to add new upstreams for you as new origins are seen, rather than enforcing a strict list of origins.

JoshMock avatar Apr 25 '25 19:04 JoshMock

> I would be happy to contribute this functionality, but would like to know from the primary maintainers which dispatcher implementations you think are good candidates for adding this functionality before doing the work.

The Pool is the minimal entity that requires this implementation; from it, further abstraction layers (e.g. Agent and BalancedPool) would need to forward this down a level.

metcoder95 avatar Apr 27 '25 09:04 metcoder95

> The Pool is the minimal entity that requires this implementation

Pool only supports connecting to one origin, and already exposes connections to limit the number of Clients it can create. So I'm confused how one would implement a feature in Pool that isn't relevant unless you have a collection of Pools, which only happens at higher levels of abstraction.

Would it make more sense to implement this in BalancedPool and/or Agent by comparing the configured number against the sum of the PoolStats.connected value for all of its pools?
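
As a rough sketch of that check (pools and maxTotalSockets are illustrative names here, not existing undici options):

```js
// Compare a configured global limit against the sum of
// PoolStats.connected over a set of Pools.
function overGlobalLimit (pools, maxTotalSockets) {
  const connected = pools.reduce((sum, p) => sum + p.stats.connected, 0)
  return connected >= maxTotalSockets
}
```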

JoshMock avatar Apr 28 '25 16:04 JoshMock

Got it, good catch, completely forgot about the connections property.

> Would it make more sense to implement this in BalancedPool and/or Agent by comparing the configured number against the sum of the PoolStats.connected value for all of its pools?

Yeah, it can be in both

metcoder95 avatar Apr 29 '25 06:04 metcoder95

I have a sample implementation done for BalancedPool here: https://github.com/JoshMock/undici/commit/8a7ec595fbed2a559956b5e6b4b892f386cc6912

The main problem I'm observing with this implementation is a memory leak in my kDispatcher implementation that I don't yet have a solution for. I wrap each handler's onComplete each time a request is dispatched, but it seems the handler is unique on every dispatch: using a private symbol on the handler to track whether it had already been wrapped did not work. The result is that onComplete gets re-wrapped each time a request is put back on the queue, causing calls to kQueueResume to increase over time in some cases.
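
For illustration, a sketch of the wrap-once guard described above (kAlreadyWrapped is a made-up symbol); as noted, it fails in practice because a fresh handler object arrives on every dispatch, so the guard never matches:

```js
// Guard against double-wrapping a handler's onComplete. This only works
// if the same handler object is reused across dispatches; with a fresh
// handler each time, the symbol check never hits and onComplete keeps
// getting re-wrapped.
const kAlreadyWrapped = Symbol('alreadyWrapped')

function wrapOnComplete (handler, onDone) {
  if (handler[kAlreadyWrapped]) return handler
  const original = handler.onComplete?.bind(handler)
  handler.onComplete = (trailers) => {
    onDone() // e.g. resume the queue
    original?.(trailers)
  }
  handler[kAlreadyWrapped] = true
  return handler
}
```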

Is there a good way to avoid this? Maybe an interceptor would work, but I only see compose being exposed as an end-user feature rather than being used anywhere internally.

Would love advice from @metcoder95 and other maintainers, as well as any other feedback on my current implementation.

JoshMock avatar May 01 '25 20:05 JoshMock

Left a comment on the commit, but I'd suggest opening the PR so we can iterate on it there

metcoder95 avatar May 02 '25 08:05 metcoder95

@metcoder95 bumping this; I've opened #4192 as a draft PR and would love a preliminary review of the work in progress.

JoshMock avatar May 07 '25 17:05 JoshMock

Finally circled back to this and opened https://github.com/nodejs/undici/pull/4365 to add support for limiting open sockets in Agent.

@mcollina @metcoder95 Would love your feedback on that! 🖤

JoshMock avatar Jul 30 '25 21:07 JoshMock

With https://github.com/nodejs/undici/pull/4365 ready to merge, supporting a max # of origins in an Agent, do we want to pursue a similar improvement to BalancedPool? My needs for the Elasticsearch client should be met by the Agent change but I'm happy to contribute that if it's desired, since I'm already in that headspace.

JoshMock avatar Aug 04 '25 19:08 JoshMock

sgtm 👍

metcoder95 avatar Aug 05 '25 07:08 metcoder95

Another need I'd like to address is limiting max idle sockets, to provide parity with http.Agent.maxFreeSockets.

The approach (see the sketch after this list):

  • set a maxFree option on Agent at instantiation
  • a drain event is emitted by a client
    • Agent counts the number of free connections across all origins
    • if free >= maxFree, close this connection
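
A hypothetical sketch of that drain check, assuming access to the Agent's per-origin Pools (which are internal in real undici) and the proposed maxFree option; PoolStats.free counts idle connections:

```js
// On a client 'drain' event, count free connections across all origins
// and signal whether the just-drained connection should be closed.
function shouldCloseOnDrain (pools, maxFree) {
  const totalFree = pools.reduce((sum, pool) => sum + pool.stats.free, 0)
  return totalFree >= maxFree // if true, close the drained connection
}
```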

I may work on that functionality for Agent before addressing BalancedPool if this approach seems sound. (Also, please let me know if we should track this in a separate issue.)

JoshMock avatar Aug 14 '25 18:08 JoshMock