paho.mqtt.c icon indicating copy to clipboard operation
paho.mqtt.c copied to clipboard

(Linux) CLOCK_MONOTONIC to be replaced by CLOCK_BOOTTIME

Open Pascal-Fremaux opened this issue 3 years ago • 8 comments

What do you think of the following ?

Is your feature request related to a problem? Please describe. Seems that the MQTT PING can be blocked on embedded platform that goes to suspend (Linux Kernel 3.x). And may be not restarting after (cannot say more as the reporter has gone in vacation without more information...) Might be consequence of change in https://github.com/eclipse/paho.mqtt.c/issues/920

Describe the solution you'd like Replace usage of CLOCK_MONOTONIC by CLOCK_BOOTTIME. It seems to work for us on initial tests.

Describe alternatives you've considered

Additional context Linux CLOCK_MONOTONIC is affected by the SUSPEND time ; not CLOCK_BOOTTIME

(Kernel 4.17 would have unified the 2 clocks, but the change was reverted later, unknown status. See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d6ed449afdb38f89a7b38ec50e367559e1b8f71f; revert: https://www.spinics.net/lists/linux-tip-commits/msg43709.html. And discussion https://github.com/golang/go/issues/24595)

The MONOTONIC clock is not fast forwarded by the time spent in suspend on resume. This is only done for the BOOTTIME clock.

(concerns only the MqttTime)

Pascal-Fremaux avatar Apr 26 '21 09:04 Pascal-Fremaux

Probably need to have a compile option to fall back to MONOTONIC/REALTIME in case BOOTTIME is not available - someone is going to use a really old Linux, or it won't work in a particular environment and I can't test them all obviously.

The MacOS implementations will need to be separated out if BOOTTIME doesn't exist there.

icraggs avatar Apr 27 '21 14:04 icraggs

I removed the comment concerning thread.c as it is probably wrong for the same reasons as the current comment, and by the way I did not noticed it was not used on Linux (cause Linux defines USE_TRYWAIT above).

CLOCK_BOOTTIME exists for Linux only for more than 10 years since Linux 2.6.39 (https://linux.die.net/man/2/clock_gettime, https://lwn.net/Articles/420142/)

Pascal-Fremaux avatar Apr 27 '21 15:04 Pascal-Fremaux

Not a Mac specialist, but they seem to use a different API https://opensource.apple.com/source/Libc/Libc-1158.1.2/gen/clock_gettime.3.auto.html

where CLOCK_MONOTONIC is equivalent to Linux CLOCK_BOOTTIME (not impacted by suspend) and CLOCK_UPTIME_RAW is equivalent to Linux CLOCK_MONOTONIC (stop the count on suspend)

Pascal-Fremaux avatar Apr 27 '21 15:04 Pascal-Fremaux

So as long as the Mac is left as is, and preferably old Linuxes can be catered for with "#if defined" it sounds good to me. Are you going to open a PR?

icraggs avatar Apr 28 '21 13:04 icraggs

Not for the moment, but we will test the change for CLOCK_BOOTTIME on embedded Linux 3.x (and Android 10) and I may give a feedback later to say if it is valid solution or not for those platforms.

Pascal-Fremaux avatar Apr 28 '21 14:04 Pascal-Fremaux

It might be nice to make this a CMake build flag, or maybe collect a few legacy (err, I mean "classic") options behind one flag, if we're worried about polluting the build.

I'll also do some quick tests, as I do maintain a few embedded Linux projects that are pretty old. But if it works as expected, this would be really helpful on battery-based systems.

fpagliughi avatar Apr 28 '21 15:04 fpagliughi

If we can get away with #ifdefs for Linux + kernel version rather than a build option, I'd prefer that. As this option appears to be non-POSIX then I think we need to default to CLOCK_MONOTONIC outside of Linux >= 2.6.39

icraggs avatar Apr 28 '21 17:04 icraggs

A bit late but now I can report 6 months without issue when our platform goes to suspend with this patch

diff --git a/src/MQTTTime.c b/src/MQTTTime.c
index 9afab0245c11c18992be8cfa53e88d9c883132e0..d05cc8d3f4e52f98eb5b92f9ab5694908fa5d6fe 100644
--- a/src/MQTTTime.c
+++ b/src/MQTTTime.c
@@ -48,7 +48,7 @@ START_TIME_TYPE MQTTTime_start_clock(void)
 START_TIME_TYPE MQTTTime_start_clock(void)
 {
 	struct timespec start;
-	clock_gettime(CLOCK_MONOTONIC, &start);
+	clock_gettime(CLOCK_BOOTTIME, &start);
 	return start;
 }
 #else
@@ -57,7 +57,7 @@ START_TIME_TYPE MQTTTime_start_clock(void)
 	struct timeval start;
 	struct timespec start_ts;
 
-	clock_gettime(CLOCK_MONOTONIC, &start_ts);
+	clock_gettime(CLOCK_BOOTTIME, &start_ts);
 	start.tv_sec = start_ts.tv_sec;
 	start.tv_usec = start_ts.tv_nsec / 1000;
 	return start;
diff --git a/src/Thread.c b/src/Thread.c
index b8f789d6ecfd92011e42f02bcd63a69984c8a627..47c11e40a9db237362fae924f62cb8439e60d86c 100644
--- a/src/Thread.c
+++ b/src/Thread.c
@@ -263,7 +263,7 @@ int Thread_wait_sem(sem_type sem, int timeout)
 		 * Does this make it susceptible to system clock changes?
 		 * The intervals are small enough, and repeated, that I think it's not an issue.
 		 */
-		if (clock_gettime(CLOCK_REALTIME, &ts) != -1)
+		if (clock_gettime(CLOCK_BOOTTIME, &ts) != -1)
 		{
 			ts.tv_sec += timeout;
 			rc = sem_timedwait(sem, &ts);

Pascal-Fremaux avatar Nov 23 '21 14:11 Pascal-Fremaux