rec: Ponder logging a warning when the `max-tcp-clients` limit is reached
- Program: Recursor
- Issue type: Feature request
Short description
When max-tcp-clients is reached the recursor stops accepting new TCP connections, leaving them to rot in the OS TCP listen queue (tcp-listen-overflows increases). We have no metrics about it (tcp-listen-overflows is global to the system), and as far as I can tell nothing is written to the logs either.
It would be nice to:
- increase a counter when the limit is reached
- log a warning (rate-limited, if possible, to prevent flooding the logs)
- consider raising the default value, as 128 TCP connections feels very low these days and the recursor should be able to deal with a lot of them since it now has threads dedicated to TCP handling.
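To make the first two points concrete, here is a rough sketch of a counter combined with a rate-limited warning. Everything in it is hypothetical (the `TCPLimitReporter` name, its members, the message text, logging to `std::cerr`); it is not taken from the Recursor code base, and it assumes it is only called from a single thread (or that each TCP thread has its own instance).

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>

// Hypothetical helper, not the Recursor's actual code: counts every time the
// TCP client limit is hit, but emits at most one warning per interval.
struct TCPLimitReporter
{
  using Clock = std::chrono::steady_clock;

  explicit TCPLimitReporter(std::chrono::seconds interval) :
    d_interval(interval) {}

  // Call whenever a connection cannot be accepted because the limit is reached.
  // The counter is always incremented; returns true if a warning was logged.
  bool onLimitReached(Clock::time_point now = Clock::now())
  {
    ++d_overflows;
    if (d_warned && now - d_lastWarning < d_interval) {
      return false; // still inside the quiet period, stay silent
    }
    d_warned = true;
    d_lastWarning = now;
    std::cerr << "Warning: max-tcp-clients limit reached ("
              << d_overflows << " occurrences so far)" << std::endl;
    return true;
  }

  uint64_t overflows() const { return d_overflows; }

private:
  std::chrono::seconds d_interval;
  Clock::time_point d_lastWarning{};
  uint64_t d_overflows{0};
  bool d_warned{false};
};
```

With an interval of, say, 10 seconds, a burst of refused connections would produce a single log line while the counter still reflects every occurrence.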
No counter, but a metric has been implemented in #14606
Reopened, pondering raising the limit.
@rgacogne Do you have an opinion on what would be a good default limit?
Ideally, I would like something like 1000 concurrent incoming TCP connections by default, but that might be risky. A conservative approach might be to look at how many file descriptors (the actual limiting factor in most cases) we have left once we have accounted for the other file descriptor users, and set the limit to `std::min(max-tcp-clients, remaining file descriptors)`, or something like that.
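A minimal sketch of that conservative approach, purely as an illustration: the function name and its parameters are made up, and how the "reserved" number is computed (UDP sockets, outgoing queries, pipes, and so on) is left to the caller.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: clamp the configured max-tcp-clients value to the number
// of file descriptors left once the other users have been accounted for.
// availableFDs is the process file descriptor limit, reservedFDs is what the
// other file descriptor users are expected to need.
std::size_t effectiveMaxTCPClients(std::size_t configured, std::size_t availableFDs, std::size_t reservedFDs)
{
  // Avoid unsigned underflow when the reservation already exceeds the limit.
  const std::size_t remaining = availableFDs > reservedFDs ? availableFDs - reservedFDs : 0;
  return std::min(configured, remaining);
}
```

For example, with a configured value of 1000, a limit of 1024 descriptors and 600 of them reserved, this would yield 424 rather than 1000.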
Note that Bind still uses 150 by default, Knot (auth, couldn't find the value for Knot Resolver) uses "one half of the file descriptor limit for the server process", and Unbound seems to allocate 10 connections per worker thread.
It would complicate checkOrFixFDS, but I'll take a look. Good moment to double check its logic.
Fixed by #14838