librdkafka
Consider supporting per-socket TCP Keep-Alive settings: `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`
Description
First of all, thanks for the existing support of TCP Keep-Alive through the socket.keepalive.enable configuration.
I am in a similar position to those here: https://github.com/edenhill/librdkafka/issues/3109, and I'm struggling to get my producers to retain persistent connections once they go idle.
I'd like to request additional support for the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT setsockopt options, which are supported on many (most?) operating systems, and enable the keep-alive to be tuned on a per-socket basis.
Note that IPPROTO_TCP must be used instead of SOL_SOCKET when calling setsockopt with these options.
- `TCP_KEEPIDLE` means "How long must the socket be idle before we start sending keep-alive probes"
- `TCP_KEEPINTVL` means "How long between each keep-alive probe"
- `TCP_KEEPCNT` means "After how many unacknowledged probes will we consider the connection dead"
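To illustrate, here's roughly what the per-socket tuning looks like on Linux. This is only a sketch, not a proposed implementation; the values (30s idle, 10s interval, 4 probes) are made up and error handling is minimal:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Illustrative helper: tune keep-alive on an already-created socket fd.
 * The specific values are examples only. */
static int tune_keepalive(int fd) {
        int enable = 1;  /* turn keep-alive on */
        int idle   = 30; /* seconds of idle time before the first probe */
        int intvl  = 10; /* seconds between probes */
        int cnt    = 4;  /* unacknowledged probes before the connection is considered dead */

        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &enable, sizeof(enable)) == -1)
                return -1;
        /* Note: IPPROTO_TCP, not SOL_SOCKET, for the per-socket tuning options. */
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
                return -1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
                return -1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
                return -1;
        return 0;
}
```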
Background
Currently, users of socket keepalive must rely on their operating system to tune the parameters of the keep-alive.
Linux, for example, has the following sysctl settings:
- `net.ipv4.tcp_keepalive_time` (same meaning as `TCP_KEEPIDLE`)
- `net.ipv4.tcp_keepalive_intvl` (same meaning as `TCP_KEEPINTVL`)
- `net.ipv4.tcp_keepalive_probes` (same meaning as `TCP_KEEPCNT`)
Using the operating system configurations isn't always ideal:
- Changing the setting can affect other connections on the system that you did not intend to alter
- Changing the setting isn't always possible. Application developers do not always tightly control their deployment environments.
- Changing the setting for a docker/container deployment is not straightforward:
  - In Kubernetes, the necessary `sysctl` settings are (usually) considered unsafe, and therefore require you to deploy a privileged container, plus a non-standard `kubelet` configuration on your hosts.
  - Fully managed Kubernetes services, like Amazon's EKS, Azure's AKS, or Google GKE, tend to encourage standardised host node images that put you at arm's length from the necessary configuration.
- The default for `tcp_keepalive_time` (a.k.a. `TCP_KEEPIDLE`) is 2 hours on many configurations, which is far too large for most modern middle-boxes
Platform Support
`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are not part of the POSIX standard for `setsockopt`, but they do seem to be fairly broadly supported on most operating systems.
Here are some references to the various docs:
Mac OS
Mac OS supports TCP_KEEPINTVL and TCP_KEEPCNT, but uses TCP_KEEPALIVE in place of TCP_KEEPIDLE.
https://github.com/apple/darwin-xnu/blob/a1babec6b135d1f35b2590a1990af3c5c5393479/bsd/netinet/tcp.h#L215
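Portable code would presumably need a small switch on the constant name; roughly something like this sketch:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Sketch only: pick the right option name for the keep-alive idle time. */
static int set_keepalive_idle(int fd, int idle_seconds) {
#if defined(TCP_KEEPIDLE)
        return setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                          &idle_seconds, sizeof(idle_seconds));
#elif defined(TCP_KEEPALIVE)
        /* macOS: same semantics as TCP_KEEPIDLE, different constant name. */
        return setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE,
                          &idle_seconds, sizeof(idle_seconds));
#else
        (void)fd;
        (void)idle_seconds;
        return -1; /* not supported on this platform */
#endif
}
```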
Older Windows
WSAIoctl and SIO_KEEPALIVE_VALS seem to be supported as far back as Windows 2000, though that API does not have an equivalent of TCP_KEEPCNT.
Note that these are also set in milliseconds, whereas TCP_KEEPIDLE and TCP_KEEPINTVL are usually set in seconds.
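For reference, roughly what that older Windows path looks like (the values mirror the Linux sketch above and are just examples; note the millisecond units):

```c
/* Windows-only sketch; requires linking against ws2_32. */
#include <winsock2.h>
#include <mstcpip.h>

static int tune_keepalive_win(SOCKET s) {
        struct tcp_keepalive ka;
        DWORD bytes_returned = 0;

        ka.onoff             = 1;         /* enable keep-alive */
        ka.keepalivetime     = 30 * 1000; /* idle time before first probe, in ms */
        ka.keepaliveinterval = 10 * 1000; /* interval between probes, in ms */
        /* No TCP_KEEPCNT equivalent: the probe count is not settable via this API. */

        return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                        NULL, 0, &bytes_returned, NULL, NULL);
}
```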
> I'm struggling to get my producers to retain persistent connections once they go idle.
How long do the connections stay open after becoming idle? Is the issue simply that you need to configure `connections.max.idle.ms` on the broker?
For systems that require low write latency, a keep-alive is IMHO the better solution.
The `connections.max.idle.ms` timeout will close the socket, requiring a new TLS handshake with the partition leader node before the next write.
I agree with @nickwb that allowing `TCP_KEEPIDLE` and `TCP_KEEPINTVL` to be configured would help a lot.
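In the meantime, it looks like this can be approximated per-socket today via the existing socket creation callback (`rd_kafka_conf_set_socket_cb()`), by creating the socket yourself and applying the options before librdkafka connects. A rough Linux-only sketch with example values:

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <librdkafka/rdkafka.h>

/* Possible workaround: create the broker socket ourselves and apply the
 * keep-alive tuning before librdkafka uses it. Values are examples only.
 * Real code should also set FD_CLOEXEC, as librdkafka's default callback does. */
static int keepalive_socket_cb(int domain, int type, int protocol, void *opaque) {
        int idle = 30, intvl = 10, cnt = 4;
        int fd   = socket(domain, type, protocol);

        (void)opaque;
        if (fd == -1)
                return -1;
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
        return fd;
}

/* When building the configuration:
 *   rd_kafka_conf_set_socket_cb(conf, keepalive_socket_cb);
 * and set socket.keepalive.enable=true so SO_KEEPALIVE is turned on. */
```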
TCP-level keepalives will not keep the connection alive past load-balancer or broker idle timeouts, since those timeouts operate at the application level and need Kafka requests to reset them, so I'm not sure adding additional TCP keepalive settings will help in practice.
The Kafka protocol, unfortunately, does not provide a "keepalive" request.