libpcap
libpcap copied to clipboard
Set keepalives on rpcap's control socket
Would you be interested in the following extension of libpcap's API? It builds up on #772 and closes #766 (but doesn't hard-code anything in libpcap and leave it to the users, which seems to be a much more sensible approach).
Since #772 now verifies the integrity of the TCP control connection, optionally allowing keep-alives on it will allow to more efficiently react on a broken network connectivity (the data being transported over UDP or TCP). If you prefer this commit to follow #772, feel free to close this pull request (but #772 is useful on its own for UDP-based data transport when the network is regarded as perfect).
On Linux, it seems the first keep-alive is sent after 2 hours of idling and a broken connection will be detected 20 minutes later (from man 7 tcp), if and only if SO_KEEPALIVE is set, so a libpcap application retrieving packets from a remote rpcapd may be stuck for quite a long while.
The usage is quite simple: pcap_set_control_keepalive (happy to get suggestions about a better name) needs to be called after a successful pcap_open with some values (for example, something more aggressive but still reasonable would be pcap_set_control_keepalive(handle, 1, 5, 15, 5) which would wait 15 seconds of idle time before sending a keep-alive, dropping the connection after 5 missed keep-alives each one spaced by 5 seconds).
The keep-alives settings probably only works on Linux.
If the idea is validated, I'll push another commit for the corresponding man pages.
Regardless of the other points, TCP keep-alives date back to at least 1989, and Linux isn't the only OS that implements them in present-day Internet.
Yup, I only have a Linux for now, but I guess I could probably check the API for *BSD and Windows.
I was more interested in getting your feedback on the basic idea than to implement something exhaustive if I must rewrite most of it because it doesn't fit libpcap's API.
How do you think, where would a custom keep-alive interval be used: at the client side or the server side or both?
I'm not part of the project anymore, but @fjolliton might want to hear about that.
Rebased on the current master branch. Any objections to merging this change as it is?
As it turns out, this code needs some amendments to build on OSes other than Linux.
We no longer have pcap_snprintf() in the trunk; just use snprintf(), as we only support building on platforms (OS+compiler) that support it.
C:\projects\libpcap\pcap-rpcap.c(3523): warning C4133: 'function': incompatible types - from 'int *' to 'const char *'
That seems to mean that setsockopt() signature is different on Windows.
Except Windows, this change now looks much better CI-wise. When the platform does not support setting keepalive, would it be more appropriate to have the error compile-time instead of run-time?
APIs that affect contents of the private data pointed to by the priv member of a pcap_t should not be called directly - they should be called through a pointer in the pcap_t, as their behavior is module-dependent.
You'll have to add a new set_control_keepalive_op to the "More methods." section of the pcap_t structure in pcap-int.h.
Then add a new PCAP_WARNING_CONTROL_KEEPALIVE_NOTSUP error code to the list of warning codes in pcap/pcap.h.
Then, add a pcap_set_control_keepalive_not_initialized() routine to pcap.c that looks something like
static int
pcap_set_control_keepalive_not_initialized(pcap_t *p, int enable _U_, int keepcnt _U, int keepidle _U_, int keepintvl _U_)
{
if (p->activated) {
/* Not set by the module, so probably not supported */
return PCAP_WARNING_CONTROL_KEEPALIVE_NOTSUP;
}
/* in case the caller doesn't check for PCAP_ERROR_NOT_ACTIVATED */
(void)snprintf(pcap->errbuf, sizeof(pcap->errbuf),
"This handle hasn't been activated yet");
}
and set set_control_keepalive_op to pcap_set_control_keepalive_not_initialized in initialize_ops() in pcap.c.
Then have add a int pcap_set_control_keepalive() to pcap.c that looks something like
int
pcap_set_control_keepalive(pcap_t *p, int enable, int keepcnt, int keepidle, int keepintvl)
{
return (p-> set_control_keepalive_op(p, enable, keepcnt, keepidle, keepintvl);
}
Then rename pcap_set_control_keepalive() in pcap-rpcap.c to pcap_set_control_keepalive_rpcap() and, in pcap_open_rpcap(), set fp-> set_control_keepalive_op to pcap_set_control_keepalive_rpcap.
Finally, add a message string for PCAP_WARNING_CONTROL_KEEPALIVE_NOTSUP to pcap_statustostr().
That way, if somebody calls pcap_set_control_keepalive() on a pcap_t that doesn't refer to an rpcap session, it'll return PCAP_WARNING_CONTROL_KEEPALIVE_NOTSUP, which the application can report or can just ignore.
(Also, make pcap_set_control_keepalive_rpcap() return PCAP_ERROR, which is #defined to -1, rather than -1.)
Thank you for the detailed comments, clearly this change is far from being ready for merging. If nobody volunteers to amend it as described, I might try doing that later.
@ether42, do you prefer to complete this change yourself or to let @GabrielGanne do it?
@ether42, do you prefer to complete this change yourself or to let @GabrielGanne do it?
As explained in https://github.com/the-tcpdump-group/libpcap/pull/773#issuecomment-687184028 I'm not working on this project anymore so anyone can take over.
Thank you for confirming, closing this pull request then. Please excuse us for not being able to work on it earlier. @GabrielGanne, please address Guy Harris feedback above in your version of these changes.