Trimming journal

Open slinkydeveloper opened this issue 9 months ago • 6 comments

The idea is to let users trim the journal at a given command, removing all the commands that happened afterwards, plus their completions.

Mark & copy up to (excluding) the trim point.

slinkydeveloper avatar Mar 12 '25 09:03 slinkydeveloper

The challenge of this feature is the restart mechanism. This is roughly how I'm gonna implement it:

  • Add a new field, restarts, to the invocation status. We start writing it in 1.3, and when it is not present its value defaults to 0 (this makes back/forward compat easy)
  • When sending Invoke command to invoker, we send this restarts field. Sending Invoke commands to invoker is a transient message, not written to storage, so all good for versioning
  • When the invoker reads the journal, it reads the invocation status, and when doing so it also reads this restarts field. If the restarts count doesn't match the one in the Invoke command, boom, the command is invalid and gets discarded.
  • When the PP sends an Abort command, it means the state machine either transitioned the invocation to End or incremented the restarts count, thus making sure that the invoker will fence off old streams. Internally, the invoker <-> invocation task communication also uses this field to fence off old messages
  • When the invoker sends InvokerEffect to PP it attaches the restarts count. The restarts count is not written when 0, thus making sure back-compat is easy.
  • If the invoker gets an Invoke with a higher restarts count from the state machine, it aborts the previous one. Essentially either Abort with restarts or Invoke with restarts + 1 wins.
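
The fencing rules in the list above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual Restate code: `Invoker`, `Outcome`, `on_invoke`, `on_abort` and `pp_accepts_effect` are hypothetical names, the invocation id is reduced to a `u64`, and the handling of a duplicate Invoke with the same restarts count (discarded here) is a guess.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified invocation id.
type InvocationId = u64;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// A new invocation task was started.
    Started,
    /// A running task with a lower restarts count was aborted and replaced.
    Replaced,
    /// The running task was aborted.
    Aborted,
    /// The message belongs to an old attempt and is fenced off.
    Discarded,
}

/// Minimal invoker model: remembers the restarts count of each running task.
#[derive(Default)]
struct Invoker {
    running: HashMap<InvocationId, u32>,
}

impl Invoker {
    /// `stored_restarts` is the value read from the invocation status together
    /// with the journal; `None` models an absent field, which defaults to 0.
    fn on_invoke(
        &mut self,
        id: InvocationId,
        cmd_restarts: u32,
        stored_restarts: Option<u32>,
    ) -> Outcome {
        // The Invoke command must match what the invocation status says.
        if stored_restarts.unwrap_or(0) != cmd_restarts {
            return Outcome::Discarded;
        }
        match self.running.get(&id).copied() {
            // Assumption: an Invoke that is not newer than the running task is dropped.
            Some(current) if current >= cmd_restarts => Outcome::Discarded,
            // Invoke with a higher restarts count wins over the running task.
            Some(_) => {
                self.running.insert(id, cmd_restarts);
                Outcome::Replaced
            }
            None => {
                self.running.insert(id, cmd_restarts);
                Outcome::Started
            }
        }
    }

    /// Abort wins when it carries the restarts count of the running task
    /// (or, by assumption, a newer one); older Aborts are ignored.
    fn on_abort(&mut self, id: InvocationId, restarts: u32) -> Outcome {
        match self.running.get(&id).copied() {
            Some(current) if current <= restarts => {
                self.running.remove(&id);
                Outcome::Aborted
            }
            _ => Outcome::Discarded,
        }
    }
}

/// PP side: an InvokerEffect is accepted only if its restarts count matches
/// the invocation status. A missing count means 0 (never written when 0).
fn pp_accepts_effect(status_restarts: u32, effect_restarts: Option<u32>) -> bool {
    effect_restarts.unwrap_or(0) == status_restarts
}
```

Treating an absent field as 0 on both paths is what keeps messages from older versions valid for invocations that were never restarted.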

I'm also gonna proceed to remove the Killed state, as it's not needed anymore.

slinkydeveloper avatar Mar 12 '25 09:03 slinkydeveloper

When the PP sends InvokerEffect it attaches the restarts count. The restarts count is not written when 0, thus making sure back-compat is easy.

Did you mean the invoker instead of the PP?

tillrohrmann avatar Mar 12 '25 15:03 tillrohrmann

Introducing something like an invocation_epoch sounds like a good idea to me. Off the top of my head, it should solve, in a nicer way, the problem that the Killed status tried to solve before.

tillrohrmann avatar Mar 12 '25 15:03 tillrohrmann

fyi @AhmedSoliman

tillrohrmann avatar Mar 12 '25 16:03 tillrohrmann

Updating this with new findings. Fencing off invoker effects is not enough: we also need to fence off completions coming from other PPs that belong to old invocation epochs. This is how I plan to do that:

ServiceInvocationResponseSink and friends need to carry around the invocation epoch of the caller invocation.

Then we need to store the following data structure in the caller invocation status:

max_epoch_per_comp_range: map<numeric range of completion_id, maximum inclusive epoch allowed>

This data structure is updated accordingly every time we trim. The invariant of this data structure is that ranges MUST be non-overlapping. This data structure seems to fit https://docs.rs/rangemap/latest/rangemap/inclusive_map/struct.RangeInclusiveMap.html
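
To make the non-overlapping invariant concrete, here is a small std-only stand-in for such a map, written as a sketch rather than a proposal to replace rangemap. `EpochRanges`, `insert` and `get` are hypothetical names, completion ids and epochs are assumed to be `u32`, and how exactly a trim translates into inserts is not specified here. Inserting a range overwrites whatever it overlaps and keeps the left and right remainders of the old ranges.

```rust
use std::collections::BTreeMap;

/// Non-overlapping inclusive ranges of completion ids, each mapped to an epoch.
/// Keyed by range start; the value is (inclusive end, epoch).
#[derive(Default, Debug)]
struct EpochRanges {
    ranges: BTreeMap<u32, (u32, u32)>,
}

impl EpochRanges {
    /// Inserts `start..=end -> epoch`, overwriting the overlapped portions of
    /// existing ranges so that stored ranges never overlap.
    fn insert(&mut self, start: u32, end: u32, epoch: u32) {
        assert!(start <= end, "empty range");
        // Existing ranges that intersect [start, end].
        let overlapping: Vec<(u32, (u32, u32))> = self
            .ranges
            .range(..=end)
            .filter(|(_, v)| v.0 >= start)
            .map(|(&s, &v)| (s, v))
            .collect();
        for (s, (e, old_epoch)) in overlapping {
            self.ranges.remove(&s);
            if s < start {
                // Keep the part of the old range left of the new one.
                self.ranges.insert(s, (start - 1, old_epoch));
            }
            if e > end {
                // Keep the part of the old range right of the new one.
                self.ranges.insert(end + 1, (e, old_epoch));
            }
        }
        self.ranges.insert(start, (end, epoch));
    }

    /// Returns the epoch stored for the range containing `id`, if any.
    fn get(&self, id: u32) -> Option<u32> {
        self.ranges
            .range(..=id)
            .next_back()
            .filter(|(_, v)| id <= v.0)
            .map(|(_, v)| v.1)
    }
}
```

For example, after `insert(1, 10, 0)` followed by `insert(5, 10, 1)`, ids 1 to 4 still resolve to epoch 0 while ids 5 to 10 resolve to epoch 1.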

And then the algorithm when I get a journal entry (which can be either command, completion or signal) is as follows:

on entry:
  if no entry.epoch or entry is signal: accept
  if entry.epoch equal: accept
  if entry.epoch different:
    if entry is command: discard // This is the case of invoker sending commands for old epochs
    if entry is completion:
      if max_epoch_per_comp_range[completion.id] <= entry.epoch: accept
  all the other cases: discard
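
A literal transcription of the pseudocode above into Rust could look like this. It is a sketch under assumptions: `Entry`, `EntryKind`, `Decision` and `decide` are hypothetical names, the lookup into max_epoch_per_comp_range is passed in as a closure, and a completion id with no range in the map is treated as "all the other cases" and discarded. The comparison direction is copied from the pseudocode as written.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EntryKind {
    Command,
    Completion { id: u32 },
    Signal,
}

/// A journal entry with the optional epoch it was produced in.
struct Entry {
    kind: EntryKind,
    epoch: Option<u32>,
}

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Accept,
    Discard,
}

/// `current_epoch` is the epoch of the invocation receiving the entry.
/// `range_epoch` looks up max_epoch_per_comp_range for a completion id.
fn decide(
    entry: &Entry,
    current_epoch: u32,
    range_epoch: impl Fn(u32) -> Option<u32>,
) -> Decision {
    match (entry.kind, entry.epoch) {
        // No epoch attached, or a signal: always accept.
        (EntryKind::Signal, _) | (_, None) => Decision::Accept,
        // Same epoch as the invocation: accept.
        (_, Some(e)) if e == current_epoch => Decision::Accept,
        // The invoker sending commands for an old epoch.
        (EntryKind::Command, Some(_)) => Decision::Discard,
        // Completion from a different epoch: consult the range map.
        (EntryKind::Completion { id }, Some(e)) => match range_epoch(id) {
            Some(bound) if bound <= e => Decision::Accept,
            // Assumption: ids without a stored range are discarded too.
            _ => Decision::Discard,
        },
    }
}
```

Passing the lookup as a closure keeps the decision logic independent of whichever range map implementation ends up holding the data.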

slinkydeveloper avatar Mar 12 '25 16:03 slinkydeveloper

For now we won't make awakeables epoch-aware: this adds a reasonable amount of complexity and is not even necessarily what the user wants. Plus, trim and restart should be a break-the-glass operation, like kill, so it is expected that some inconsistencies might arise. When we get to expose signals, this problem will also go away, as users will manually input the correlation id to complete.

slinkydeveloper avatar Apr 15 '25 07:04 slinkydeveloper

A note about naming and relationship with kill:

  • The name of the operation will be reset:
    • From beginning (this will be present in the UI but not in the Admin API itself)
    • From given entry index
  • By default, it will kill child invocations that have been trimmed
  • By default, it will revert the state

Both the kill and reset REST endpoints should start exposing the following knobs:

  • Kill child invocations
  • Revert state

We keep the defaults we have for kill (changing the default would require https://github.com/restatedev/restate/issues/2765). In any case, we're gonna play with these defaults in the UI.
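
As a rough sketch of how these knobs could be modelled on the request side (all type and field names here are hypothetical, not the Admin API): reset defaults both knobs to true as described above, while the kill request takes the knobs explicitly, since its existing defaults are not spelled out in this thread.

```rust
/// Hypothetical knobs shared by kill and reset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Knobs {
    kill_child_invocations: bool,
    revert_state: bool,
}

/// Hypothetical reset request: resets from the given journal entry index.
#[derive(Debug)]
struct ResetRequest {
    from_entry_index: u32,
    knobs: Knobs,
}

impl ResetRequest {
    /// By default, reset kills trimmed child invocations and reverts the state.
    fn new(from_entry_index: u32) -> Self {
        Self {
            from_entry_index,
            knobs: Knobs {
                kill_child_invocations: true,
                revert_state: true,
            },
        }
    }

    /// Overrides the default knobs.
    fn with_knobs(mut self, knobs: Knobs) -> Self {
        self.knobs = knobs;
        self
    }
}

/// Hypothetical kill request: knobs are passed explicitly because the current
/// kill defaults are not stated in this thread.
#[derive(Debug)]
struct KillRequest {
    knobs: Knobs,
}
```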

slinkydeveloper avatar Apr 30 '25 10:04 slinkydeveloper