etcd
etcd copied to clipboard
Always keep the compact revision, and delay compact tombstone revision
Always keep the compact revision, no matter it's a tomestone revision or not. If it's a tombstone revision, then cleanup it in next compact operation.
PoC of option 2 in https://docs.google.com/document/d/1APJE38DxENsRLLVapRz1CQ6UD4-uYaFutO12Y01VcNw/edit
cc @fuweid
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 68.81%. Comparing base (
a6bb22d) to head (763fee3). Report is 2 commits behind head on main.
:exclamation: Current head 763fee3 differs from pull request most recent head 891e412
Please upload reports for the commit 891e412 to get more accurate results.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
| Files | Coverage Δ | |
|---|---|---|
| server/storage/mvcc/key_index.go | 71.25% <100.00%> (+7.09%) |
:arrow_up: |
... and 25 files with indirect coverage changes
@@ Coverage Diff @@
## main #18162 +/- ##
==========================================
- Coverage 68.94% 68.81% -0.14%
==========================================
Files 416 416
Lines 35127 35121 -6
==========================================
- Hits 24219 24169 -50
- Misses 9518 9556 +38
- Partials 1390 1396 +6
Continue to review full report in Codecov by Sentry.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update a6bb22d...891e412. Read the comment docs.
@fuweid Please feel free to continue to work on the issue https://github.com/etcd-io/etcd/issues/18089 on top of this PR if it makes sense to you. Pls feel free to ping me on slack if you want discussion.
These are all the test cases I can think of, please feel free to add more. Most likely some of them have already been covered by existing test cases.
I also added the test cases in the doc (see option 2)
cc @fuweid @serathius
Test cases
unit test on keyIndex
case 1: compact a tombstone revision X, it shouldn't be compacted. Next compaction should cleanup the revision X for both cases below,
- 1.1 The revision X is the last revision for the key
- 1.2 The revision X isn't the last revision for the key
case 2: compact a normal revision X, it shouldn't be compacted. In next compaction,
- 2.1 If the revision X is the last revision for the key, it shouldn't be compacted.
- 2.2 If the revision X isn't the last revision for the key, it should be compacted.
e2e test on watch
case 1: compact tomestone revisions shouldn't impact watch rough steps: 1) start a watch client to watch on a key 2) start a goroutine to continuous put or delete a key 3) perform compaction against the different tombstone revisions multiple times, verify that no revisions are dropped.
case 2: compact normal revisions shouldn't impact watch similar steps as above, but we only execute put operations
case 3: a new watch starting on a compacted tombstone revision should receive the delete event Refer to the reproduction steps in the issue
e2e test on hash values
case 1: All members should have the same hash value up to the compacted revision after compacting a tombstone revision rough steps: 1) start a 3 member cluster; 2) execute some put and delete operations; 3) execute compaction on revision X; 4) get hashKV up to revision X from all members, verify the hash values are consistent
- 1.1 all members have the same etcd version
- 1.2 mix versions, i.e. main vs 3.5.14 or release-3.5 vs 3.5.14
case 2: all member should have the same hash value up to the compacted revision after compacting a non-tombstone revision similar steps as above, but execute compact on a non-tombstone revision
- 2.1 all members have the same etcd version
- 2.2 mix versions, i.e. main vs 3.5.14 or release-3.5 vs 3.5.14
case 3: All members should have the same hash value up to a non-compacted revision after compacting a tombstone revision similar steps as case 1, but get hashKv up to a non-compacted revision
- 3.1 all members have the same etcd version
- 3.2 mix versions, i.e. main vs 3.5.14 or release-3.5 vs 3.5.14
case 4: All members should have the same hash value up to a non-compacted revision after compacting a non-tombstone revision similar steps as case 2, but get hashKv up to a non-compacted revision
- 4.1 all members have the same etcd version
- 4.2 mix versions, i.e. main vs 3.5.14 or release-3.5 vs 3.5.14
robustness test
Support compaction, and verify the impact on hash values and watch.
@fuweid are you still working on https://github.com/etcd-io/etcd/issues/18089 based on this PR?
Can we start implementing those tests?
The bigger problem then the bug existing is fact that we don't have those tests already. Even before we have a fix, we can just add the test to codify the current behavior. It should give us better visibility into impact of when we implement the fix.
/cc @MadhavJivrajani @siyuanfoundation @fuweid @jmhbnz @henrybear327 @ah8ad3
@serathius: GitHub didn't allow me to request PR reviews from the following users: ah8ad3.
Note that only etcd-io members and repo collaborators can review this PR, and authors cannot review their own PRs.
In response to this:
/cc @MadhavJivrajani @siyuanfoundation @fuweid @jmhbnz @henrybear327 @ah8ad3
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@ahrtr: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-etcd-unit-test-386 | 891e4124b1853cb5def231d89d89ec4a29ddba96 | link | false | /test pull-etcd-unit-test-386 |
| pull-etcd-unit-test-amd64 | 891e4124b1853cb5def231d89d89ec4a29ddba96 | link | true | /test pull-etcd-unit-test-amd64 |
| pull-etcd-unit-test-arm64 | 30eaf124d8eaeeb7c05fc2845df708b8e8609e8b | link | true | /test pull-etcd-unit-test-arm64 |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Can we start implementing those tests?
The bigger problem then the bug existing is fact that we don't have those tests already. Even before we have a fix, we can just add the test to codify the current behavior. It should give us better visibility into impact of when we implement the fix.
/cc @MadhavJivrajani @siyuanfoundation @fuweid @jmhbnz @henrybear327 @ah8ad3
I would like to start working on those tests but i am a little bit lost of track if anybody started it or not is there any update?