consistent expiration semantics
Currently, each thread updates its expiry clock independently from others and independently from the ongoing transactions.
- We should move expiration clock sampling into the transaction. A transaction should use one constant clock value for all of its operations; otherwise we risk inconsistent replies within a transaction for sequences like GET X, some delay, GET X.
- We should consider having a single atomic variable for the clock so that transaction clocks are monotonic. I do not think it's strictly necessary, but we may consider it.
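A minimal sketch of the single-atomic-variable idea from the second bullet, assuming a process-wide clock that hands out `max(wall clock, last value handed out)`. The class `MonotonicTxClock` and its methods are hypothetical names used for illustration only; they are not existing Dragonfly code.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of Dragonfly): a shared clock whose samples
// never go backwards, even if the wall clock itself is adjusted backwards.
class MonotonicTxClock {
 public:
  // Returns max(wall_ms, largest value published so far) and publishes it.
  uint64_t Sample(uint64_t wall_ms) {
    uint64_t prev = last_ms_.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak reloads prev, so the loop ends either
    // when we published wall_ms or when another thread published a value
    // that is already >= wall_ms.
    while (prev < wall_ms &&
           !last_ms_.compare_exchange_weak(prev, wall_ms,
                                           std::memory_order_relaxed)) {
    }
    return prev < wall_ms ? wall_ms : prev;
  }

  // Convenience wrapper that reads the wall clock in milliseconds.
  uint64_t SampleNow() {
    auto since_epoch = std::chrono::system_clock::now().time_since_epoch();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(since_epoch);
    return Sample(static_cast<uint64_t>(ms.count()));
  }

 private:
  std::atomic<uint64_t> last_ms_{0};
};
```

A transaction would call `SampleNow()` once at start and reuse the returned value for every operation it performs.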
See https://ably.com/blog/redis-keys-do-not-expire-atomically for motivation.
If we use transactions to set expiry or to access multiple possibly expired items, we should use a consistent clock value for the whole transaction execution.
Current behavior: we use the real clock both to set expiry and to test it.
high level design
We can use a Lamport clock mixed with the wall clock.
- When we start a transaction we sample the wall clock, and bind its value to that transaction.
- When we perform the operations on DbSlice we should use the clock value from that transaction.
For that we should replace all DbSlice interfaces like FindExt(DbIndex db_ind, std::string_view key) const; with FindExt(const Context& cntx, std::string_view key) const; and pass the clock via Context. In fact, I will probably adapt the interfaces ASAP.
- DbSlice::now_ms_ should disappear.
- We do delete items outside of transactions (active/background expiry), which is why we need the Lamport clock semantics.
We will provide a shard-level now_ms_ clock, updated transparently by incoming transactions using max(old, new) logic. This won't require any wall-time sampling, just a simple integer update. Background processes will use this clock, which guarantees that they won't delete items that the ongoing transaction does not consider expired. We can update the shard clock transparently inside the EngineShard::RunInShard method.
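The design above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Dragonfly code: `ShardSketch`, `AdvanceClock`, `ExpireStep`, and the `time_now_ms` field of `Context` are hypothetical names, the store is a plain hash map with a single database, and a key counts as expired when its expiry time is less than or equal to the clock in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

using DbIndex = uint16_t;

// Hypothetical per-transaction context: the clock is sampled once when the
// transaction starts and stays constant for all of its operations.
struct Context {
  DbIndex db_index = 0;
  uint64_t time_now_ms = 0;
};

// Hypothetical stand-in for a shard plus its DbSlice.
class ShardSketch {
 public:
  void Set(std::string_view key, std::string value, uint64_t expire_at_ms) {
    table_[std::string(key)] = Entry{std::move(value), expire_at_ms};
  }

  // Called when a transaction runs on this shard: max(old, new), no
  // wall-clock sampling involved.
  void AdvanceClock(uint64_t tx_time_ms) {
    now_ms_ = std::max(now_ms_, tx_time_ms);
  }

  // Lookups judge expiry by the transaction clock, not by the shard clock.
  std::optional<std::string> Find(const Context& cntx,
                                  std::string_view key) const {
    auto it = table_.find(std::string(key));
    if (it == table_.end() || it->second.expire_at_ms <= cntx.time_now_ms)
      return std::nullopt;
    return it->second.value;
  }

  // Background expiry uses only the shard clock, so it never runs ahead of
  // the newest transaction clock this shard has seen.
  size_t ExpireStep() {
    size_t deleted = 0;
    for (auto it = table_.begin(); it != table_.end();) {
      if (it->second.expire_at_ms <= now_ms_) {
        it = table_.erase(it);
        ++deleted;
      } else {
        ++it;
      }
    }
    return deleted;
  }

  uint64_t now_ms() const { return now_ms_; }

 private:
  struct Entry {
    std::string value;
    uint64_t expire_at_ms = 0;
  };
  std::unordered_map<std::string, Entry> table_;
  uint64_t now_ms_ = 0;
};
```

In this sketch a transaction with clock 900 keeps seeing a key that expires at 1000 no matter how much wall time passes while it runs, and the background pass cannot remove that key until some transaction with a clock of at least 1000 has advanced the shard clock.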
Some notes:
- Sampling a high-resolution clock is relatively expensive (~100ns CPU time), but absl::GetCurrentTimeNanos() is probably faster. I use it in EngineShard::Heartbeat. We should add explicit benchmarks to dragonfly_test (copy them from https://github.com/romange/gaia/blob/master/base/walltime_test.cc#L155).
- Once we get rid of the UpdateExpireClock calls in the heartbeat, we can decrease the heartbeat frequency and get rid of the hz flag, because today hz=1000 is needed only because we must update the expire clock every ms.
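The clock-sampling cost mentioned in the first note could be checked with a rough measurement along these lines. This is only a sketch: it times `std::chrono::system_clock::now()` from the standard library rather than absl::GetCurrentTimeNanos(), the function name `AvgClockSampleNs` is hypothetical, and a real benchmark in dragonfly_test would use a proper benchmarking harness.

```cpp
#include <chrono>
#include <cstdint>

// Returns the average cost in nanoseconds of one system_clock::now() call,
// measured over `iters` calls (iters must be > 0). The sampled values are
// accumulated into *sink so the compiler cannot drop the calls.
double AvgClockSampleNs(uint64_t iters, uint64_t* sink) {
  using Timer = std::chrono::steady_clock;
  uint64_t acc = 0;
  auto start = Timer::now();
  for (uint64_t i = 0; i < iters; ++i) {
    auto sample = std::chrono::system_clock::now().time_since_epoch();
    acc += static_cast<uint64_t>(sample.count());
  }
  auto end = Timer::now();
  *sink = acc;
  std::chrono::duration<double, std::nano> elapsed = end - start;
  return elapsed.count() / static_cast<double>(iters);
}
```

The result depends heavily on the platform and on how the clock is implemented there, so the number is only meaningful when compared against other clock sources measured on the same machine.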
was fixed in 1.0