etcd icon indicating copy to clipboard operation
etcd copied to clipboard

HashKV should compute the hash of all MVCC keys up to a given revision

Open fuweid opened this issue 1 year ago • 6 comments

Bug report criteria

  • [X] This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
  • [X] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
  • [X] You have read the etcd bug reporting guidelines.
  • [X] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.

What happened?

Based on https://etcd.io/docs/v3.5/dev-guide/api_reference_v3/, HashKV api should compute the hash of all MVCC keys up to a given revision, even if the key/value is tombstone. However, HashKV could skip compacted revision.

https://github.com/etcd-io/etcd/blob/e7f572914d79f0705b3dc8ca28d9a14b0f854d49/server/storage/mvcc/hash.go#L60-L76

For the compacted revision, the lower.GreaterThan(kr) always return true. Assume that the compacted revision is for key A. If the key A has revisions higher than compacted revision, the HashKV won't compute the hash on compacted revision.

Case 1: ETCD contains more than two keys

For example, there are two keys in store.

// key: "foo"
// modified: 9
// generations:
//    {{9, 0}[1]}
//    {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
//    {{2, 0}, {3, 0}, {4, 0}(t)[3]}

// key: "foo2"
// modified: 2
// generations:
//    {{2, 1}[1]}

After compaction on Revision{Main: 5}, the store will be like

// key: "foo"
// modified: 9
// generations:
//    {{9, 0}[1]}
//    {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}

// key: "foo2"
// modified: 2
// generations:
//    {{2, 1}[1]}

When we perform HashKV on Revision{Main: 7}, the hash should compute the following revisions:

  • Revision{Main: 2, Sub: 1}
  • Revision{Main: 5, Sub: 0}
  • Revision{Main: 6, Sub: 0}
  • Revision{Main: 7, Sub: 0}

However, HashKV skips Revision{Main: 5} because the compacted revision is {Main: 5} and kvindex.Keep returns two revisions {Main: 2, Sub: 1} and {Main: 7}, The Revision{Main: 5} isn't in result of kvindex.Keep.

Case 2: ETCD contains only one key

There is key foo.

// key: "foo"
// modified: 9
// generations:
//    {{9, 0}[1]}
//    {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
//    {{2, 0}, {3, 0}, {4, 0}(t)[3]}

After compaction on Revision{Main: 5}, the store will be like

// key: "foo"
// modified: 9
// generations:
//    {{9, 0}[1]}
//    {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}

When we perform HashKV on Revision{Main: 7}, the hash should compute the following revisions:

  • Revision{Main: 5, Sub: 0}
  • Revision{Main: 6, Sub: 0}
  • Revision{Main: 7, Sub: 0}

HashKV skips Revision{Main: 5} because the compacted revision is {Main: 5} and kvindex.Keep returns one revision {Main: 7}.

However, when we perform HashKV on Revision{Main: 8}, the kvindex.Keep will return empty keep because the Revision{Main: 8} is tombstone. So, the hash result will compute the following revisions, which is expectation.

  • Revision{Main: 5, Sub: 0}
  • Revision{Main: 6, Sub: 0}
  • Revision{Main: 7, Sub: 0}
  • Revision{Main: 8, Sub: 0}

What did you expect to happen?

HashKV should compute the hash of all MVCC keys up to a given revision

How can we reproduce it (as minimally and precisely as possible)?

https://github.com/fuweid/etcd/commit/a6454f2fcc13aa116086b4df907541d3942b3b00

$ cd server/storage/mvcc
$ go test -v -run TestHashKVImpactedByKeep -count=1 ./

Anything else we need to know?

The HashKV was introduced by #8263. IMO, the keep, available revision map, was used to skip the revisions which were marked deleted in on-going compaction. If there is no compaction, the keep is always empty and then HashKV will compute all the MVCC keys up to the revision.

#8263 has bug and then #8333 introduced keyIndex.Keep interface to fix bug. However, after #8333, the non-empty keep map forces HashKV to skip compacted revision.

Etcd version (please run commands below)

  • main
  • release-3.5
  • release-3.4

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

No response

fuweid avatar Jul 16 '24 13:07 fuweid