etcd
etcd copied to clipboard
HashKV should compute the hash of all MVCC keys up to a given revision
Bug report criteria
- [X] This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
- [X] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
- [X] You have read the etcd bug reporting guidelines.
- [X] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
Based on https://etcd.io/docs/v3.5/dev-guide/api_reference_v3/, HashKV api should compute the hash of all MVCC keys up to a given revision, even if the key/value is tombstone. However, HashKV could skip compacted revision.
https://github.com/etcd-io/etcd/blob/e7f572914d79f0705b3dc8ca28d9a14b0f854d49/server/storage/mvcc/hash.go#L60-L76
For the compacted revision, the lower.GreaterThan(kr) always return true.
Assume that the compacted revision is for key A. If the key A has revisions higher than compacted revision, the HashKV won't compute the hash on compacted revision.
Case 1: ETCD contains more than two keys
For example, there are two keys in store.
// key: "foo"
// modified: 9
// generations:
// {{9, 0}[1]}
// {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
// {{2, 0}, {3, 0}, {4, 0}(t)[3]}
// key: "foo2"
// modified: 2
// generations:
// {{2, 1}[1]}
After compaction on Revision{Main: 5}, the store will be like
// key: "foo"
// modified: 9
// generations:
// {{9, 0}[1]}
// {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
// key: "foo2"
// modified: 2
// generations:
// {{2, 1}[1]}
When we perform HashKV on Revision{Main: 7}, the hash should compute the following revisions:
- Revision{Main: 2, Sub: 1}
- Revision{Main: 5, Sub: 0}
- Revision{Main: 6, Sub: 0}
- Revision{Main: 7, Sub: 0}
However, HashKV skips Revision{Main: 5} because the compacted revision is {Main: 5} and kvindex.Keep returns two revisions {Main: 2, Sub: 1} and {Main: 7}, The Revision{Main: 5} isn't in result of kvindex.Keep.
Case 2: ETCD contains only one key
There is key foo.
// key: "foo"
// modified: 9
// generations:
// {{9, 0}[1]}
// {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
// {{2, 0}, {3, 0}, {4, 0}(t)[3]}
After compaction on Revision{Main: 5}, the store will be like
// key: "foo"
// modified: 9
// generations:
// {{9, 0}[1]}
// {{5, 0}, {6, 0}, {7, 0}, {8, 0}(t)[4]}
When we perform HashKV on Revision{Main: 7}, the hash should compute the following revisions:
- Revision{Main: 5, Sub: 0}
- Revision{Main: 6, Sub: 0}
- Revision{Main: 7, Sub: 0}
HashKV skips Revision{Main: 5} because the compacted revision is {Main: 5} and kvindex.Keep returns one revision {Main: 7}.
However, when we perform HashKV on Revision{Main: 8}, the kvindex.Keep will return empty keep because the Revision{Main: 8} is tombstone. So, the hash result will compute the following revisions, which is expectation.
- Revision{Main: 5, Sub: 0}
- Revision{Main: 6, Sub: 0}
- Revision{Main: 7, Sub: 0}
- Revision{Main: 8, Sub: 0}
What did you expect to happen?
HashKV should compute the hash of all MVCC keys up to a given revision
How can we reproduce it (as minimally and precisely as possible)?
https://github.com/fuweid/etcd/commit/a6454f2fcc13aa116086b4df907541d3942b3b00
$ cd server/storage/mvcc
$ go test -v -run TestHashKVImpactedByKeep -count=1 ./
Anything else we need to know?
The HashKV was introduced by #8263. IMO, the keep, available revision map, was used to skip the revisions which were marked deleted in on-going compaction. If there is no compaction, the keep is always empty and then HashKV will compute all the MVCC keys up to the revision.
#8263 has bug and then #8333 introduced keyIndex.Keep interface to fix bug.
However, after #8333, the non-empty keep map forces HashKV to skip compacted revision.
Etcd version (please run commands below)
- main
- release-3.5
- release-3.4
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
Relevant log output
No response