exanic-software
exanic-software copied to clipboard
Bug fix for exasock timeout logic
There's a bug in libs/exasock/socket/common.h that corrupts the timeout parameters set on exasock sockets. The following line occurs twice in this file:
const struct timespec *to = (const struct timespec *)&to_val
This is a static cast from a timeval pointer to a timespec pointer.
Here's how timeval is defined in the GNU C library:
struct timeval {
time_t tv_sec; /* Seconds */
long int tv_usec; /* Microseconds */
};
And here's the definition of timespec:
struct timespec {
time_t tv_sec; /* Seconds */
long int tv_nsec; /* Nanoseconds */
};
The two problems with this are:
- Technically, the types of
tv_usec
andtv_nsec
are implementation defined. In the GNU C library, they're the same (long int), but these could be different types altogether in another implementation. - More gravely, one struct represents the time as a (seconds, microseconds) pair, while the other uses (seconds, nanoseconds)! Obviously, you can't simply cast a pointer from one struct to the other and be done with it. But that's what exasock does, today. Socket options like the send timeout (SO_SENDTIMEO) are specified with a timeval instance. Then, to compute the timeout time, exasock "converts" this to a timespec, so a timeout given in microseconds is interpreted as a time in nanoseconds.
The consequence of this bug is that timeouts set to any value less than a second are essentially ignored. For instance, a timeout of 900,000 microseconds (900 milliseconds) is treated like a timeout of 900,000 nanoseconds (900 microseconds). Since the resolution of the CLOCK_MONOTONIC_COARSE system timer is on the order of milliseconds on Linux (e.g., see https://upvoid.com/devblog/2014/05/linux-timers/), a timeout parameter < 1 millisecond has little meaning/predictability.
This pull request suggests a minimal changeset to fix the bug. It has been tested on a Linux system and confirmed correct.
Resolves Issue #84