snapd notices: Ensure that two notices don't have the same lastDateTime

When a client consumes notices through the /v2/notices interface, it usually first receives all the old notices, stores the date and time of the last one, and in subsequent calls uses that date and time to fetch only the newer notices, instead of getting the old ones over and over again.

Unfortunately, if two or more notices happen to share the same date and time, this scheme fails: the call returns the first one, and when the client asks for the next, the request filters by the date/time of the last notice sent, so the second notice with the same timestamp is never delivered. This can happen if the timer used for timestamps doesn't have enough granularity.

This patch fixes the problem by ensuring that no two notices have the same date/time value for lastReceived, adding one or more nanoseconds when required.

Apr 11 '24 11:04 sergio-costas

@ernestl Is there a folder for "utility functions" where I could put the CompareDates() function? Maybe inside timeutil?

Apr 11 '24 12:04 sergio-costas

CC @benhoyt

Apr 12 '24 20:04 olivercalder

Sorry, I didn't intend to click Approve there, just comment. I realise I'm not an approver around here :-), though I would want those changes I suggested for backporting this to Pebble.

Apr 12 '24 22:04 benhoyt

@benhoyt Well, this came up due to a resolution problem that I assumed was in Go, but in the end it was in the snapd-glib API that I was using to communicate with snapd (GDateTime has one microsecond resolution). But after that, I began to read about timers in PCs and found an interesting rabbit hole.

TL;DR: although timers with very high resolution have existed since the original Pentium, and any recent x86_64 device has timers with nanosecond granularity, this can't be guaranteed on not-so-recent devices or on other architectures, where, under some circumstances, granularity can be as coarse as tenths or hundredths of a microsecond.

The original Pentium added counters that incremented with the processor's clock, so on an original Pentium at 50MHz they resolved to 20ns. Higher clock speeds give finer granularity, and any processor clocked at 1GHz or faster has nanosecond granularity. But when power management was added, the clock could vary or even be stopped, so the system had to rely on the HPET, with a clock frequency of about 14MHz according to Wikipedia (thus, about 71ns). This was fixed in later models, but then multicore devices arrived, and in those, the counters weren't guaranteed to be synchronized between cores, so software solutions were needed (like a daemon for AMD K8 processors that periodically re-synchronized the counters between cores). Most modern x86_64 processors fixed this problem too, so currently we can have one nanosecond granularity in the timers. ARM seems to have similar counters, but I haven't found much info about them. And, anyway, there are still devices like the Raspberry Pi 3 with a 400MHz clock speed that, if I get it correctly, couldn't have better granularity than 2.5ns.
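The granularity figures above follow from the tick period being the inverse of the counter frequency. A quick sketch of that arithmetic, using the frequencies quoted in this comment (the `tickPeriodNs` helper is only for illustration):

```go
package main

import "fmt"

// tickPeriodNs returns the duration of one counter tick, in nanoseconds,
// for a counter running at the given frequency in hertz.
func tickPeriodNs(hz float64) float64 {
	return 1e9 / hz
}

func main() {
	clocks := []struct {
		name string
		hz   float64
	}{
		{"50 MHz", 50e6},
		{"14 MHz", 14e6},
		{"400 MHz", 400e6},
		{"1 GHz", 1e9},
	}
	for _, c := range clocks {
		fmt.Printf("%-8s %.1f ns per tick\n", c.name, tickPeriodNs(c.hz))
	}
}
```

This gives 20ns, about 71ns, 2.5ns and 1ns respectively, matching the numbers in the paragraph.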

So, since we can't guarantee that we always have 1 nanosecond granularity in the system clock, I think that it's a good idea to add this patch.

Sources: https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps https://en.wikipedia.org/wiki/Time_Stamp_Counter https://0xax.gitbooks.io/linux-insides/content/Timers/linux-timers-6.html https://forums.raspberrypi.com/viewtopic.php?t=133981 https://lkml.org/lkml/2005/11/4/173

Apr 15 '24 08:04 sergio-costas

@olivercalder Mmm... but why would the time in the server "go backwards" by such a big time? I mean: if you are talking about daylight saving, the timezone is also changed, so the nanoseconds will still be monotonically increasing because they should be calculated against UTC, by adding/subtracting the timezone to the local time. The only case where the time goes really backwards is if the internal clock has some drift (which is inevitable), and NTP adjusts it. But that difference would be milliseconds at most, so the RepeatAfter should not be a problem...

Apr 17 '24 07:04 sergio-costas

> @olivercalder Mmm... but why would the time in the server "go backwards" by such a big time? I mean: if you are talking about daylight saving, the timezone is also changed, so the nanoseconds will still be monotonically increasing because they should be calculated against UTC, by adding/subtracting the timezone to the local time. The only case where the time goes really backwards is if the internal clock has some drift (which is inevitable), and NTP adjusts it. But that difference would be milliseconds at most, so the RepeatAfter should not be a problem...

@sergio-costas I believe the problem would arise if the RepeatAfter time is on the order of the NTP adjustments. I agree this is unlikely, and it's especially unlikely that a notice which one wants to ensure will not be missed would have a RepeatAfter time set at all, since that by definition would have the potential to suppress essential messages if they occur within the RepeatAfter timeframe. So I think this shouldn't be a problem in practice, but technically it is possible for it to occur, which @zyga pointed out to me in a discussion we had a few weeks ago. AFAICT though, the problem of lost signals only remains if one uses RepeatAfter ...?

Apr 17 '24 17:04 olivercalder

Rebased.

May 07 '24 11:05 sergio-costas
