`Timestamp::now()` is non-monotonic, which causes reducer invocation timestamps to be non-monotonic

Open · gefjon opened this issue 9 months ago · 1 comment

From a report on Discord.

Timestamp::now() reads SystemTime, which is non-monotonic, per the docs:

> Distinct from the Instant type, this time measurement **is not monotonic**.

(Emphasis in the original text.)

This is unfortunate for reducer timestamps, as it means it's possible for a repeating reducer to have a later invocation with an earlier timestamp than its predecessor. Note that in an MVCC world, we still won't expect TX timestamps to be in the same order as their serialized/committed order. The undesirable behavior is strictly related to scheduled reducers. We believe it should be the case that, if a reducer A schedules a reducer B, then A happens-before B, where happens-before implies both that the TX offset of A is less than the TX offset of B, and that the reducer timestamp of A is less than (or equal to?) the reducer timestamp of B.

I can see a few ways to achieve this:

  • On process startup, take a reading from both SystemTime and Instant, and treat them as referring to the same point in time. Then, have Timestamp::now() read from Instant (which is monotonic), and use the known point in time to translate the read Instant into an offset since the Unix epoch.
  • Keep a (thread-local?) cell containing the latest SystemTime or Timestamp ever read from the system clock, the "high water mark." In Timestamp::now(), if the system clock reports a value less than the HWM, return the HWM instead. Like:
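```rust
use std::cell::Cell;
use std::time::SystemTime;

impl Timestamp {
    fn now() -> Self {
        // Per-thread high-water mark: the latest system-clock reading seen on this thread.
        thread_local! {
            static CLOCK_HIGH_WATER_MARK: Cell<SystemTime> = Cell::new(SystemTime::UNIX_EPOCH);
        }

        let sys_clock = SystemTime::now();
        let hwm = CLOCK_HIGH_WATER_MARK.get();
        if hwm > sys_clock {
            // The system clock went backwards: report the high-water mark instead.
            Self::from_system_time(hwm)
        } else {
            // The clock advanced (or stood still): record and report the new reading.
            CLOCK_HIGH_WATER_MARK.set(sys_clock);
            Self::from_system_time(sys_clock)
        }
    }
}
```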
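For comparison, the first option (anchoring `Instant` to a single `SystemTime` reading) might look roughly like the sketch below. `ClockAnchor`, `ANCHOR`, and `monotonic_now` are hypothetical names for illustration, and the function returns a plain `Duration` since the Unix epoch rather than a real `Timestamp`, so this is a rough outline under those assumptions and not a proposed patch.

```rust
use std::sync::OnceLock;
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// Hypothetical pair of readings, taken once and treated as the same point in time.
struct ClockAnchor {
    wall: SystemTime,
    mono: Instant,
}

impl ClockAnchor {
    /// Read both clocks back to back.
    fn capture() -> Self {
        Self { wall: SystemTime::now(), mono: Instant::now() }
    }

    /// Translate a monotonic reading into an offset since the Unix epoch.
    fn since_epoch(&self, at: Instant) -> Duration {
        // A wall clock set before 1970 is treated as the epoch itself.
        let base = self.wall.duration_since(UNIX_EPOCH).unwrap_or(Duration::ZERO);
        // Readings taken before the anchor saturate to zero elapsed time.
        base + at.saturating_duration_since(self.mono)
    }
}

/// Process-wide anchor, captured lazily on first use (assumption: "startup"
/// is approximated by the first call).
static ANCHOR: OnceLock<ClockAnchor> = OnceLock::new();

/// Hypothetical stand-in for `Timestamp::now()`: never decreases within one process.
fn monotonic_now() -> Duration {
    let anchor = ANCHOR.get_or_init(ClockAnchor::capture);
    anchor.since_epoch(Instant::now())
}
```

The trade-off is that the result drifts away from the real wall clock whenever the system clock is adjusted after the anchor is captured, and on some platforms `Instant` may not advance while the machine is suspended. The anchor is also per-process, so this alone says nothing about ordering across restarts.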

We probably need to do some investigation to determine which of these options is preferable, or if there are other ways to get a monotonic wall-time clock. This seems likely to be the kind of thing that has serious pitfalls if done wrong.

gefjon, Mar 29 '25 18:03

The closest thing to a monotonic wall-time clock I’m aware of is TAI. It requires a leap seconds table to convert to UTC, which is often impractical.

If we don’t want to make any guarantees about reducer timestamps in general, which might be wise, we could also just compute the timestamp of repeating reducers as the elapsed time since the last execution. That time can be tracked in the scheduled table.
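One possible reading of that, as a rough sketch: keep the previous invocation's timestamp next to a monotonic reading, and derive the next timestamp by adding the elapsed monotonic time. `ScheduledRun` and its fields are made-up names, not the actual scheduled-table schema, and an `Instant` cannot be persisted, so a real version would need some other way to carry the elapsed time across restarts.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-schedule bookkeeping; names are illustrative only.
struct ScheduledRun {
    /// Timestamp given to the previous invocation, in microseconds since the Unix epoch.
    last_timestamp_micros: u64,
    /// Monotonic reading taken when the previous invocation ran.
    last_instant: Instant,
}

impl ScheduledRun {
    /// Timestamp for an invocation running at monotonic time `now`:
    /// the previous timestamp plus the monotonic time elapsed since then.
    fn next_timestamp_micros(&self, now: Instant) -> u64 {
        // Saturates to zero if `now` is somehow earlier than the last reading.
        let elapsed: Duration = now.saturating_duration_since(self.last_instant);
        self.last_timestamp_micros
            .saturating_add(elapsed.as_micros() as u64)
    }

    /// Record that an invocation ran at `now`, updating both readings.
    fn record(&mut self, now: Instant) {
        self.last_timestamp_micros = self.next_timestamp_micros(now);
        self.last_instant = now;
    }
}
```

Since the elapsed time is never negative, each invocation's timestamp is greater than or equal to its predecessor's by construction, regardless of what the system clock does in between.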

kim, Mar 31 '25 16:03

#2618

gefjon, Apr 28 '25 20:04