
Time handling in antidote

Open peterzeller opened this issue 8 years ago • 6 comments

With the change to Erlang 19 we removed all calls to erlang:now.

There are still some open problems with this change and the handling of time in general:

  • [ ] Monotonicity: Replacing erlang:now with erlang:system_time (as we did) should work, as system_time is still monotonic with the default time warp setting ("No Time Warp Mode"). However, this means that our code is not time warp safe and would be incorrect if Erlang is started with time warp mode enabled (which is recommended).
  • [ ] Uniqueness: Most old uses of erlang:now do not need uniqueness. I think it might still be required for generating transaction-ids (in clocksi_interactive_tx_coord_fsm, line 171). There might be other places.
  • [ ] Restarts: erlang:now and erlang:system_time seem to lose their guarantees after a system restart. So after a restart we might get an older timestamp, which could break the protocol.
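To make the uniqueness point concrete, here is a minimal sketch (not Antidote code; the module and function names are made up for illustration) that takes N consecutive readings of erlang:system_time/1 and counts how many repeat an earlier value. Any non-zero result on a given machine would mean the raw reading cannot serve as a transaction id on its own.

```erlang
-module(time_uniqueness_demo).
-export([duplicates/1]).

%% Take N consecutive readings of erlang:system_time/1 and count how many
%% of them repeat a value that was already seen.
%% The result depends on the machine and clock resolution.
duplicates(N) when is_integer(N), N >= 0 ->
    Readings = [erlang:system_time(microsecond) || _ <- lists:seq(1, N)],
    %% lists:usort/1 sorts and removes duplicates.
    N - length(lists:usort(Readings)).
```

For example, time_uniqueness_demo:duplicates(100000) in a shell gives a rough idea of how often two calls return the same microsecond.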

peterzeller, Oct 06 '16 11:10

Is someone assigned to work on this?

cmeiklejohn, Oct 07 '16 19:10

Not a fix, but a suggestion is to run NTP on all nodes in a DC before any of the erlang VMs are started.

My understanding is that "time warp" is for large clock corrections, but there is also "time correction", which adjusts the Erlang clock frequency by a small amount without violating monotonicity. So if the system clocks are synced before the start and kept in sync afterwards, the hope is that "time correction" will be sufficient. There might be other downsides to using no time warp mode, though. Maybe worth testing.
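For testing this, it helps to see which mode a node is actually running in. A small sketch, assuming only the standard erlang:system_info/1 and erlang:time_offset/1 calls (the module name and map keys are made up for illustration):

```erlang
-module(time_mode_info).
-export([report/0]).

%% Return the time-related settings of the running VM as a map.
report() ->
    #{%% no_time_warp | single_time_warp | multi_time_warp
      time_warp_mode    => erlang:system_info(time_warp_mode),
      %% true if time correction is enabled
      time_correction   => erlang:system_info(time_correction),
      %% preliminary | final | volatile
      time_offset_state => erlang:system_info(time_offset),
      %% current offset between monotonic time and system time
      time_offset_us    => erlang:time_offset(microsecond)}.
```

Calling time_mode_info:report() on every node of a DC would show whether they all run with the same settings.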

The txid uniqueness and restart problems you mentioned do look like issues that could happen, but they should be easy to fix, I guess.

tcrain, Oct 10 '16 04:10

Looking at the new Erlang API for time correction, I think we can fix this issue now.

The time functions currently used are:

  • bcounter_mgr.erl: erlang:timestamp() (4 times)
  • dc_utilities.erl: erlang:system_time(micro_seconds)

and two calls to rand:seed:

  • clocksi_interactive_coord.erl, line 528
  • interactive_dc_query_receive_socket.erl, line 107

The generation of transaction IDs is currently handled by the call to the dc_utilities function.

Looking at both the current Erlang time correction documentation and random numbers documentation, we could do the following:

  • Set the time warp VM argument: +C multi_time_warp
  • Use a tuple to create strictly monotonic timestamps (for dc_utilities), which will also uphold the guarantees we need after a restart:
    Time = erlang:monotonic_time(),
    UMI = erlang:unique_integer([monotonic]),
    EventTag = {Time, UMI}
  • According to the rand documentation, calling rand:seed is not needed. The process's state is seeded automatically the first time it calls the rand module. So we could remove the two rand:seed calls.
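A minimal sketch of what such a helper could look like, with +C multi_time_warp set in vm.args. The module and function names are made up for illustration and are not the actual dc_utilities API. One thing to keep in mind: both erlang:monotonic_time/0 and erlang:unique_integer/1 are defined relative to the current runtime system instance, so the behaviour across restarts would need to be checked separately.

```erlang
-module(event_tag).
-export([new/0, system_time_us/0]).

%% Tag for ordering events. Within one process, monotonic time never goes
%% backwards and the unique integer is strictly increasing, so each tag
%% compares greater than the tag created before it.
new() ->
    Time = erlang:monotonic_time(),
    UMI = erlang:unique_integer([monotonic]),
    {Time, UMI}.

%% Erlang system time in microseconds, computed the time-warp-safe way:
%% monotonic time plus the current time offset.
system_time_us() ->
    erlang:monotonic_time(microsecond) + erlang:time_offset(microsecond).
```

Tuples compare element by element, so two tags with the same monotonic time are still ordered by the unique integer.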

Does this solve the problems we currently have with time handling?

I do not really know how the bounded counter manager works, so I'd need input on what guarantees it needs. Currently the bounded counter manager uses erlang:timestamp. To my knowledge, the timestamp function does not give any guarantees (neither monotonicity nor uniqueness).

albsch, Oct 24 '19 09:10

@balegas ?

bieniusa, Oct 25 '19 13:10

I was checking the code, and I think that timestamps in the bounded counter manager are used to time out resource transfer requests. It does not depend on timestamps for identifiers or for ordering.
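If only elapsed time matters, a timeout check can be based on monotonic time, which is unaffected by system time jumps. A rough sketch, not the actual bounded counter manager code (module and function names are hypothetical):

```erlang
-module(transfer_timeout).
-export([start/0, expired/2]).

%% Record when a (hypothetical) transfer request was issued.
start() ->
    erlang:monotonic_time(millisecond).

%% True once at least TimeoutMs milliseconds have passed since StartedAt,
%% where StartedAt is a value returned by start/0 on the same node.
expired(StartedAt, TimeoutMs) ->
    erlang:monotonic_time(millisecond) - StartedAt >= TimeoutMs.
```

The start value is only meaningful on the node that produced it, so it should not be sent to other nodes or stored across restarts.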

balegas, Oct 25 '19 13:10

Is this issue still open? I implemented a very simple solution for this problem in gingko, but I don't know if it performs well. Basically, I used a gen_server that made sure that every new timestamp (a regular Erlang timestamp translated to microseconds) was strictly monotonic. If it was not, the last timestamp was incremented by one and used instead.
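Roughly, the idea looks like the sketch below. This is a reconstruction for illustration, not the gingko code: the module name is made up, and it assumes erlang:system_time(microsecond) as the timestamp source.

```erlang
-module(monotonic_ts_server).
-behaviour(gen_server).

-export([start_link/0, next/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Return a timestamp in microseconds that is strictly greater than every
%% timestamp previously handed out by this server.
next() ->
    gen_server:call(?MODULE, next).

%% The state is the last timestamp handed out.
init([]) ->
    {ok, 0}.

handle_call(next, _From, Last) ->
    Now = erlang:system_time(microsecond),
    %% Use the clock reading if it moved forward, otherwise bump the
    %% last value by one.
    Next = case Now > Last of
               true  -> Now;
               false -> Last + 1
           end,
    {reply, Next, Next}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

On the performance question: every caller goes through a single process, so timestamp generation is serialized per node. The last value also lives only in the server state, so it is lost when the server restarts.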

FairPlayer4, Sep 15 '20 15:09