feat: add TCP keepalive for MySQL and PostgreSQL.
BREAKING CHANGE: [sqlx_core::net::socket::connect_tcp] takes a new parameter.
Adds TCP keepalive configuration, which can be enabled via [PgConnectOptions::tcp_keep_alive] and [MySqlConnectOptions::tcp_keep_alive].
Does your PR solve an issue?
fixes #3540
Should I re-export TcpKeepalive from socket2, or define our own type?
After some thought, I'm not entirely sure how much value this actually has.
As I worked out in https://github.com/estuary/flow/issues/1676#issuecomment-2392077544, keepalive likely won't solve the original reporter's problem.
It could be useful for other reasons, such as keeping idle connections from timing out, or catching when a server disconnects abruptly without sending a FIN packet.
However, because the connection state is managed directly, we won't see a keepalive timeout until the next time we try to read from or write to the socket, at which point we're already trying to use it anyway.
I suppose that's still preferable to it hanging forever on a read that will never complete, though.
I think this logic is required for pgnotify connections to recover after the connection has been broken for whatever reason. Could you take another look at this one @abonander to see what is needed to get this merged? Thanks!
Catching situations where a server disconnects abruptly without sending a FIN packet is a real-world use case for us.
Our application subscribes to PostgreSQL notifications via database triggers. After an infrastructure issue, we observed that the application silently stopped receiving notifications while the process itself remained running and unaware of any connection problem.
I was able to reproduce this by using iptables rules to drop all packets from the database server. In this scenario, the PgListener continues to wait indefinitely and never detects that the connection is broken. I believe a similar situation would occur if the database host is physically disconnected from the network: no FIN or RST is sent, so the client never gets notified that the connection is actually gone.
When I applied the LD_PRELOAD-based keepalive workaround described here:
https://www.redpill-linpro.com/techblog/2024/12/17/failovers-and-keepalive.html
the PgListener was able to detect the dead connection and recover correctly.
However, relying on the LD_PRELOAD trick is hacky and has undesirable side effects: it affects all TCP connections from the process, not just the database connections.
It would be very helpful if sqlx exposed dedicated TCP keepalive configuration for its database connections, so we can:
- Enable keepalive explicitly for DB connections only.
- Tune keepalive parameters (e.g., idle time, interval, probe count) to match our failover and detection requirements.
This would allow PgListener (and other long-lived connections) to reliably detect broken connections in scenarios where the server disappears without a clean shutdown.
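To make the tuning point concrete, here is a rough sketch of how the three parameters combine into a detection time. `KeepaliveSettings` and its methods are hypothetical names used for illustration only; they are not the API of this PR, sqlx, or socket2. The estimate assumes Linux-style semantics, where an idle connection is declared dead after roughly `idle + interval * probes`.

```rust
use std::time::Duration;

/// Hypothetical settings struct for illustration; not part of sqlx or socket2.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KeepaliveSettings {
    /// Idle time before the first probe is sent (TCP_KEEPIDLE on Linux).
    pub idle: Duration,
    /// Delay between unanswered probes (TCP_KEEPINTVL on Linux).
    pub interval: Duration,
    /// Unanswered probes before the connection is dropped (TCP_KEEPCNT on Linux).
    pub probes: u32,
}

impl KeepaliveSettings {
    /// Linux kernel defaults: 7200 s idle, 75 s interval, 9 probes.
    pub fn linux_defaults() -> Self {
        Self {
            idle: Duration::from_secs(7200),
            interval: Duration::from_secs(75),
            probes: 9,
        }
    }

    /// Approximate upper bound on how long a silently dead peer goes
    /// unnoticed on an otherwise idle connection.
    pub fn worst_case_detection(&self) -> Duration {
        self.idle + self.interval * self.probes
    }
}
```

With the Linux defaults this comes to 7875 seconds (a bit over two hours), which is why simply switching keepalive on without tuning would not help much for failover. Something like 30 s idle, 10 s interval, and 3 probes would bring the bound down to about a minute.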