liftbridge
Add latest offset and timestamp to activity stream events
We would like to add the latest partition message offset and timestamp to pause/resume and readonly activity stream events, and were wondering how to best implement that.
raftNode.applyOperation returns an ApplyFuture that has a Response() that could be used to return this information. The partition leader, when applyPauseStream is called, could return the latest message offset and timestamp. The metadata leader publishes activity stream messages though, so this information has to get to it somehow.
Does this seem a good way to implement this feature or is there a simpler solution?
This is actually a bit tricky, I think. Apply is only called by the metadata leader to apply a Raft operation. The ApplyFuture returned by Apply returns the response from the metadata leader's Raft FSM. The issue is the metadata leader may not be the partition leader (or even a follower), so it does not have the information needed to return in the future. Like you say, this information has to get to the metadata leader somehow to publish the activity message, but I'm not seeing a good way to do that.
What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.
I suppose that this would make the Pause and SetReadonly calls a bit slower than they are now, and it would generate one Raft event per partition instead of a single event with a list of partitions. That could be an option when making these calls.
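To make the proposal a bit more concrete, here is a minimal Go sketch of step b), where the partition leader stamps the request with the tail of its own log before forwarding it. Every name in it (pauseRequest, pauseOp, logTail, memoryTail, enrich) is hypothetical and only for illustration; none of them are existing Liftbridge types, and the timestamp is assumed to be Unix nanoseconds.

```go
package main

// pauseRequest is a hypothetical single-partition request as sent by a client.
type pauseRequest struct {
	Stream    string
	Partition int32
}

// pauseOp is the hypothetical operation a partition leader would forward to
// the metadata leader: the original request plus the partition's position.
type pauseOp struct {
	pauseRequest
	LatestOffset    int64 // offset of the newest message in the partition
	LatestTimestamp int64 // timestamp of that message (assumed Unix nanoseconds)
}

// logTail is a minimal, assumed view of a partition's commit log.
type logTail interface {
	Latest() (offset int64, timestamp int64)
}

// memoryTail is an in-memory logTail used only for this example.
type memoryTail struct{ offset, timestamp int64 }

func (m memoryTail) Latest() (int64, int64) { return m.offset, m.timestamp }

// enrich runs on the partition leader and copies the log tail into the
// operation, yielding one operation (and so one Raft event) per partition.
func enrich(req pauseRequest, tail logTail) pauseOp {
	offset, ts := tail.Latest()
	return pauseOp{pauseRequest: req, LatestOffset: offset, LatestTimestamp: ts}
}
```

The metadata leader would then only need to copy LatestOffset and LatestTimestamp from the operation into the activity stream event, without contacting the partition leader itself.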
> What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.
This could work, though the Pause/SetReadonly behavior needs to occur as a result of the Raft FSM anyway, so the only real reason to direct the request to the partition leader would be so it could send the latest offset/timestamp to the metadata leader. The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.
Another option would be to have the metadata leader request this information from the partition leaders, but this introduces failure cases. E.g. what do we do if the RPC times out? Thus there would be no guarantee the information would be present.
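A rough Go sketch of that second option shows where the missing guarantee comes from. The names here (position, fetchFunc, collectPositions) are made up for the example, and fetchFunc merely stands in for whatever RPC the metadata leader would use to reach a partition leader; this is not existing Liftbridge code.

```go
package main

import "time"

// position is a hypothetical reply from a partition leader.
type position struct {
	Offset    int64 // latest message offset
	Timestamp int64 // latest message timestamp (assumed Unix nanoseconds)
}

// fetchFunc stands in for an RPC to a partition leader, for example a NATS
// request. It is an assumption made for this sketch.
type fetchFunc func(partition int32) (position, error)

// collectPositions asks every partition leader for its position and waits at
// most timeout in total. Partitions that fail or do not answer in time are
// absent from the result, so callers must treat the data as optional.
func collectPositions(partitions []int32, fetch fetchFunc, timeout time.Duration) map[int32]position {
	type reply struct {
		partition int32
		pos       position
		err       error
	}
	// Buffered so late repliers never block once we have stopped listening.
	replies := make(chan reply, len(partitions))
	for _, p := range partitions {
		go func(p int32) {
			pos, err := fetch(p)
			replies <- reply{p, pos, err}
		}(p)
	}

	result := make(map[int32]position, len(partitions))
	deadline := time.After(timeout)
	for range partitions {
		select {
		case r := <-replies:
			if r.err == nil {
				result[r.partition] = r.pos
			}
		case <-deadline:
			// Give up on the remaining partitions; their positions stay unknown.
			return result
		}
	}
	return result
}
```

With this shape, an activity stream event for a partition whose leader timed out would have to be published without an offset and timestamp, or the whole operation would have to be delayed or failed, which is exactly the trade-off in question.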
> The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.
Couldn't the client send the request to all partition leaders? We could also have a PausePartition and a SetPartitionReadonly that only operate on one partition.
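On the client side, the fan-out could look roughly like the Go sketch below. pausePartitionFunc represents the proposed single-partition PausePartition call, which does not exist yet, and errNotLeader and pauseAll are likewise names invented for this example.

```go
package main

import (
	"errors"
	"sync"
)

// errNotLeader is a hypothetical error a server might return when the
// contacted node is not the leader of the requested partition.
var errNotLeader = errors.New("not the partition leader")

// pausePartitionFunc stands in for the proposed single-partition
// PausePartition call; it is an assumption, not an existing client method.
type pausePartitionFunc func(stream string, partition int32) error

// pauseAll calls pause once per partition, concurrently, and returns the
// failures keyed by partition. An empty map means every partition was paused.
func pauseAll(stream string, partitions []int32, pause pausePartitionFunc) map[int32]error {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	failed := make(map[int32]error)
	for _, p := range partitions {
		wg.Add(1)
		go func(p int32) {
			defer wg.Done()
			if err := pause(stream, p); err != nil {
				mu.Lock()
				failed[p] = err
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return failed
}
```

One consequence of this design is that the call is no longer all-or-nothing: some partitions may end up paused while others fail, and the client would have to retry or report the partitions left in the returned map.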