Add latest offset and timestamp to activity stream events

Open Jmgr opened this issue 3 years ago • 4 comments

We would like to add the latest partition message offset and timestamp to pause/resume and readonly activity stream events, and were wondering how to best implement that.

raftNode.applyOperation returns an ApplyFuture that has a Response() that could be used to return this information. The partition leader, when applyPauseStream is called, could return the latest message offset and timestamp. The metadata leader publishes activity stream messages though, so this information has to get to it somehow.

Does this seem a good way to implement this feature or is there a simpler solution?

Jmgr avatar Mar 16 '21 18:03 Jmgr

This is actually a bit tricky I think. Apply is only called by the metadata leader to apply a Raft operation. The ApplyFuture returned by Apply returns the response from the metadata leader's Raft FSM. The issue is the metadata leader may not be the partition leader (or even a follower), so it does not have the information needed to return in the future. Like you say, this information has to get to the metadata leader somehow to publish the activity message, but I'm not seeing a good way to do that.
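To illustrate the point above, here is a minimal Go sketch of why the future's response cannot carry the offset. All types and names are hypothetical and only model the shape of the problem; this is not Liftbridge's actual FSM code. Each server applies the operation against its own local state, and the response is whatever the applying server's FSM can see.

```go
package main

// NOTE: every name below is hypothetical and exists only for illustration.

// partitionState is what a server knows about a partition it hosts locally.
type partitionState struct {
	latestOffset    int64
	latestTimestamp int64
}

// pauseResult is what an FSM could hand back through the apply future.
type pauseResult struct {
	known           bool // false when this server holds no replica of the partition
	latestOffset    int64
	latestTimestamp int64
}

// server models one cluster member and the partition replicas it holds.
type server struct {
	name       string
	partitions map[string]partitionState
}

// applyPause models the FSM applying a pause operation on this server.
// The result can only reflect this server's local state.
func (s *server) applyPause(partition string) pauseResult {
	p, ok := s.partitions[partition]
	if !ok {
		// No local replica: there is nothing to report.
		return pauseResult{}
	}
	return pauseResult{
		known:           true,
		latestOffset:    p.latestOffset,
		latestTimestamp: p.latestTimestamp,
	}
}
```

In this model, a metadata leader that holds no replica of the partition gets back a result with `known == false`, while the partition leader applying the same operation would see the real values.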

tylertreat avatar Mar 17 '21 01:03 tylertreat

What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.

I suppose that this would make the Pause and SetReadonly calls a bit slower than they are now, and it would generate one Raft event per partition instead of a single event with a list of partitions. This behavior could be made optional when making these calls.
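A rough Go sketch of steps a) to c), to make the proposal concrete. The request and operation types and the `lookup` callback are assumptions for illustration, not existing Liftbridge types. It also shows where the "one Raft event per partition" cost comes from.

```go
package main

// NOTE: all types and functions here are hypothetical.

// pauseRequest is the client request as received by a partition leader.
type pauseRequest struct {
	Stream     string
	Partitions []int32
}

// enrichedPause is what the partition leader would forward to the
// metadata leader: one operation per partition, with log position attached.
type enrichedPause struct {
	Stream          string
	Partition       int32
	LatestOffset    int64
	LatestTimestamp int64
}

// logLookup returns the latest offset and timestamp for a partition this
// server leads; ok is false when the server is not the leader for it.
type logLookup func(stream string, partition int32) (offset, timestamp int64, ok bool)

// enrich builds one forwarded operation per locally led partition and
// returns the partitions this server cannot answer for.
func enrich(req pauseRequest, lookup logLookup) (ops []enrichedPause, skipped []int32) {
	for _, p := range req.Partitions {
		offset, ts, ok := lookup(req.Stream, p)
		if !ok {
			skipped = append(skipped, p)
			continue
		}
		ops = append(ops, enrichedPause{
			Stream:          req.Stream,
			Partition:       p,
			LatestOffset:    offset,
			LatestTimestamp: ts,
		})
	}
	return ops, skipped
}
```

The `skipped` slice makes the multi-leader problem visible: any partition led by another server cannot be enriched here.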

Jmgr avatar Mar 24 '21 14:03 Jmgr

> What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.

This could work, though the Pause/SetReadonly behavior needs to occur as a result of the Raft FSM anyway, so the only real reason to direct the request to the partition leader would be so it could send the latest offset/timestamp to the metadata leader. The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.

Another option would be to have the metadata leader request this information from the partition leaders, but this introduces failure cases. E.g. what do we do if the RPC times out? Thus there would be no guarantee the information would be present.

tylertreat avatar Mar 26 '21 17:03 tylertreat

> The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.

Couldn't the client send the request to all partition leaders? We could also have a PausePartition and SetPartitionReadonly that only operates on one partition.
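For example, the client could group the requested partitions by leader and send one request per leader. A minimal Go sketch, assuming the client already has a partition-to-leader map from its cached metadata (the function name and map shape are hypothetical):

```go
package main

import "sort"

// groupByLeader splits the requested partitions into one batch per leader.
// leaders maps partition ID to leader server ID; this shape is an assumption.
// Partitions with no known leader are returned separately so the caller can
// refresh its metadata and retry.
func groupByLeader(leaders map[int32]string, partitions []int32) (map[string][]int32, []int32) {
	byLeader := make(map[string][]int32)
	var unknown []int32
	for _, p := range partitions {
		leader, ok := leaders[p]
		if !ok || leader == "" {
			unknown = append(unknown, p)
			continue
		}
		byLeader[leader] = append(byLeader[leader], p)
	}
	// Sort each batch so the requests are deterministic.
	for _, ps := range byLeader {
		sort.Slice(ps, func(i, j int) bool { return ps[i] < ps[j] })
	}
	return byLeader, unknown
}
```

A single-partition PausePartition or SetPartitionReadonly call would be the degenerate case with exactly one batch.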

Jmgr avatar Mar 29 '21 16:03 Jmgr