cockroach icon indicating copy to clipboard operation
cockroach copied to clipboard

raft: advance commit index safely

Open lyang24 opened this issue 10 months ago • 4 comments

This change makes the commit index advancement in handleHeartbeat safe. Previously, a follower would attempt to update the commit index to whichever was sent in the MsgHeartbeat message. Out-of-bound indices would crash the node.

It is always safe to advance a commit index if the follower's log is "in sync" with the leader, i.e. when its log is guaranteed to be a prefix of the leader's log. This is always true if the term of last entry in the log matches the leader's term, otherwise this guarantee is established when the first MsgApp append message from the leader succeeds.

At the moment, the leader will never send a commit index that exceeds the follower's log size. However, this may change in future. This change is a defence-in-depth.

The newly added accTerm field will be used for other safety checks in the future, for example to establish that overriding a suffix of entries in raftLog is safe.

Part of #122100

lyang24 avatar Apr 19 '24 17:04 lyang24

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I was unable to automatically find a reviewer. You can try CCing one of the following members:

  • A person you worked with closely on this PR.
  • The person who created the ticket, or a CRDB organization member involved with the ticket (author, commenter, etc.).
  • Join our community slack channel and ask on #contributors.
  • Try find someone else from here.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl[bot] avatar Apr 19 '24 17:04 blathers-crl[bot]

This change is Reviewable

cockroach-teamcity avatar Apr 19 '24 17:04 cockroach-teamcity

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl[bot] avatar Apr 22 '24 04:04 blathers-crl[bot]

LGTM, but let's have a second review. @nvanbenschoten Could you PTAL?

pav-kv avatar May 10 '24 12:05 pav-kv

@lyang24 the blocking PR has merged. Please address Nathan's comments, and it should be safe to merge this one.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lyang24)

pkg/raft/raft.go line 1729 at r4 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote… The PR has merged. It should be safe to make the change now.

Hi I have rebased to master will merge when tests passes and will work on the follow up prs.

lyang24 avatar May 21 '24 07:05 lyang24

bors r+

lyang24 avatar May 21 '24 08:05 lyang24

:clock1: Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

craig[bot] avatar May 21 '24 08:05 craig[bot]

bors cancel

pav-kv avatar May 21 '24 10:05 pav-kv

@lyang24 there are some outstanding comments from @nvanbenschoten, could you please address them first?

@nvanbenschoten Do you want to take a final pass / LGTM?

pav-kv avatar May 21 '24 10:05 pav-kv

bors r+

lyang24 avatar May 23 '24 22:05 lyang24