Consider supporting per-socket TCP Keep-Alive settings: `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`
Description
First of all, thanks for the existing support of TCP Keep-Alive through the `socket.keepalive.enable` configuration.
I am in a similar position to those here: https://github.com/edenhill/librdkafka/issues/3109, and I'm struggling to get my producers to retain persistent connections once they go idle.
I'd like to request additional support for the `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` `setsockopt` options, which are supported on many (most?) operating systems and enable the keep-alive to be tuned on a per-socket basis.
Note that `IPPROTO_TCP` must be used instead of `SOL_SOCKET` when calling `setsockopt` with these options.
- `TCP_KEEPIDLE` means "How long must the socket be idle before we start sending keep-alive probes"
- `TCP_KEEPINTVL` means "How long between each keep-alive probe"
- `TCP_KEEPCNT` means "After how many unacknowledged probes will we consider the connection dead"
Background
Currently, users of `socket.keepalive.enable` must rely on operating-system-level settings to tune the keep-alive parameters.
Linux, for example, has the following sysctl settings:
- `net.ipv4.tcp_keepalive_time` (same meaning as `TCP_KEEPIDLE`)
- `net.ipv4.tcp_keepalive_intvl` (same meaning as `TCP_KEEPINTVL`)
- `net.ipv4.tcp_keepalive_probes` (same meaning as `TCP_KEEPCNT`)
Using the operating system configurations isn't always ideal:
- Changing the setting can affect other connections on the system that you did not intend to alter.
- Changing the setting isn't always possible. Application developers do not always tightly control their deployment environments.
- Changing the setting for a Docker/container deployment is not straightforward:
  - In Kubernetes, the necessary `sysctl` settings are (usually) considered unsafe, and therefore require you to deploy a privileged container, plus a non-standard `kubelet` configuration on your hosts.
  - Fully managed Kubernetes services, like Amazon's EKS, Azure's AKS, or Google's GKE, tend to encourage standardised host node images that put you at arm's length from the necessary configuration.
- The default for `tcp_keepalive_time` (a.k.a. `TCP_KEEPIDLE`) is 2 hours on many configurations, which is far too large for most modern middle-boxes.
Platform Support
`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are not part of the POSIX standard for `setsockopt`, but they do seem to be fairly broadly supported on most operating systems.
Here are some references to the various docs:
Mac OS
Mac OS supports `TCP_KEEPINTVL` and `TCP_KEEPCNT`, but uses `TCP_KEEPALIVE` in place of `TCP_KEEPIDLE`.
https://github.com/apple/darwin-xnu/blob/a1babec6b135d1f35b2590a1990af3c5c5393479/bsd/netinet/tcp.h#L215
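For what it's worth, a small sketch of how the idle-time option could be selected portably; the helper name is made up for the example and it assumes a plain POSIX socket fd:

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Darwin exposes the idle time as TCP_KEEPALIVE rather than TCP_KEEPIDLE,
 * so pick whichever option the platform defines. */
static int set_keepalive_idle(int fd, int idle_secs) {
#if defined(TCP_KEEPIDLE)
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs));
#elif defined(TCP_KEEPALIVE)
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE, &idle_secs, sizeof(idle_secs));
#else
    (void)fd;
    (void)idle_secs;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```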
Older Windows
`WSAIoctl` and `SIO_KEEPALIVE_VALS` seem to be supported as far back as Windows 2000, though that API does not have an equivalent of `TCP_KEEPCNT`.
Note that these values are also set in milliseconds, whereas `TCP_KEEPIDLE` and `TCP_KEEPINTVL` are usually set in seconds.
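For reference, a minimal sketch of how the same two knobs could be set on Windows via `WSAIoctl` and `SIO_KEEPALIVE_VALS`; the millisecond parameters are illustrative only, and note there is no probe-count field:

```c
#include <winsock2.h>
#include <mstcpip.h>

/* SIO_KEEPALIVE_VALS takes milliseconds and offers no equivalent of
 * TCP_KEEPCNT; the probe count is fixed by the OS. */
static int set_keepalive_win(SOCKET s, DWORD idle_ms, DWORD intvl_ms) {
    struct tcp_keepalive ka;
    DWORD bytes_returned = 0;

    ka.onoff = 1;                    /* enable keep-alive */
    ka.keepalivetime = idle_ms;      /* idle time before first probe, ms */
    ka.keepaliveinterval = intvl_ms; /* interval between probes, ms */

    return WSAIoctl(s, SIO_KEEPALIVE_VALS, &ka, sizeof(ka),
                    NULL, 0, &bytes_returned, NULL, NULL);
}
```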
> I'm struggling to get my producers to retain persistent connections once they go idle.

How long do the connections stay open after becoming idle? Is the issue simply that you need to configure `connections.max.idle.ms` on the broker?
For systems that require low write latency, a keep-alive is IMHO the better solution.
`connections.max.idle.ms` will close the socket, requiring a new TLS handshake with the partition leader node before the next write.
I agree with @nickwb that allowing `TCP_KEEPIDLE` and `TCP_KEEPINTVL` to be configured would help a lot.
TCP-level keepalives will not keep the connection alive through load-balancer or broker idle timeouts, since those timeouts operate at the application level and need Kafka requests to reset their idle timers, so I'm not sure adding additional TCP keepalive settings will help in practice.
The Kafka protocol, unfortunately, does not provide a "keepalive" request.