Enhance `etcdutl` to calculate hash of the data up to a given rev
What would you like to be added?
Currently, the command etcdctl endpoint hashkv can return the hash of an endpoint's data up to a given revision, but it requires the etcd server to still be running. In some cases, when users report an inconsistency issue, the etcd cluster may no longer be running. So we need to support calculating the hash offline; in other words, etcdutl should be able to calculate the hash as well.
Why is this needed?
It improves diagnosability and supportability.
any thoughts? @ptabor @serathius @spzala
Hi, I'd like to take this. Is the input argument of the etcdutl hashkv command the data-dir?
> Is the input argument of the etcdutl hashkv command the data-dir?
Yes, it looks good to me. Assigned to you, thx.
I would think about this from a longer-term perspective for the etcdutl tool. I think we should eventually get rid of the etcd-dump-logs and etcd-dump-db tools and position etcdutl as the tool for performing low-level operations on etcd files.
Let's look at what we currently have from a consistency perspective:
./etcdutl snapshot status ./default.etcd/member/snap/db
./etcdutl snapshot restore <filename> --data-dir {output dir} [options] [flags]
And the latter command has support for:
--skip-hash-check Ignore snapshot integrity hash value (required if copied from data directory)
So it seems that etcdutl snapshot {foo} <filename> is the pattern for commands working on 'bbolt' files, and we should preserve it.
Now the question is whether we need:
etcdutl snapshot hashkv {file}
Or
etcdutl snapshot status is good enough to extend.
I would extend etcdutl snapshot status. It already has linear complexity (walks over the whole storage):
https://github.com/ptabor/etcd/blob/6f899a7b40f7631461ffeda0067aa4c42dd17812/etcdutl/snapshot/v3_snapshot.go#L142
So there is no 'order of magnitude' cost change if we compute the hash, side by side to the original functionality.
So I envision this as:
./bin/etcdutl snapshot status -w json ./default.etcd/member/snap/db
{"hash":3884838507,"revision":16541,"totalKey":33078,"totalSize":33112064,"version":"3.6.0", "hashkv":"..."}
@ahrtr @serathius WDYT?
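To make the proposed output shape concrete, here is a minimal Go sketch that decodes the sample JSON shown above. The struct name, the helper parseStatus, and the Go field types are assumptions for illustration only; in particular, the "hashkv" field is the proposed addition, its value is elided in the sample, and treating it as a string is a guess rather than a decided format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshotStatus mirrors the JSON keys in the sample output above.
// It is a hypothetical type, not the struct used by etcdutl itself.
type snapshotStatus struct {
	Hash      uint32 `json:"hash"`
	Revision  int64  `json:"revision"`
	TotalKey  int    `json:"totalKey"`
	TotalSize int64  `json:"totalSize"`
	Version   string `json:"version"`
	// HashKV is the proposed new field; a string type is assumed here.
	HashKV string `json:"hashkv"`
}

// parseStatus decodes one JSON status document into a snapshotStatus.
func parseStatus(raw []byte) (snapshotStatus, error) {
	var s snapshotStatus
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// Sample taken from the comment above; the hashkv value stays elided.
	raw := []byte(`{"hash":3884838507,"revision":16541,"totalKey":33078,"totalSize":33112064,"version":"3.6.0","hashkv":"..."}`)
	s, err := parseStatus(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(s.Revision, s.TotalKey, s.Version)
}
```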
> I think we should eventually get rid of etcd-dump-logs and etcd-dump-db tools and position etcdutl as the tool for performing low-level operations on etcd files.
It seems like a good direction to me. It doesn't make sense to have several scattered tools; it's better to have a single offline data analysis tool, etcdutl. I may spend some time thinking about and planning this.
> Now the question is whether we need:
> etcdutl snapshot hashkv {file}
> Or
> etcdutl snapshot status is good enough to extend.
> I would extend etcdutl snapshot status. It already has linear complexity (walks over the whole storage):
> https://github.com/ptabor/etcd/blob/6f899a7b40f7631461ffeda0067aa4c42dd17812/etcdutl/snapshot/v3_snapshot.go#L142
> So there is no 'order of magnitude' cost change if we compute the hash, side by side to the original functionality.
> So I envision this as:
> ./bin/etcdutl snapshot status -w json ./default.etcd/member/snap/db
> {"hash":3884838507,"revision":16541,"totalKey":33078,"totalSize":33112064,"version":"3.6.0", "hashkv":"..."}
I prefer etcdutl snapshot hashkv {file}, and we need to support a --rev ${rev} flag so that we can calculate the hash up to the given revision. It defaults to 0, in which case it should produce the same hash value as etcdutl snapshot status. I expect its output to be similar to that of etcdctl endpoint hashkv (without the endpoint field, of course).
I am not worried about the 'order of magnitude' cost change. At the implementation level, we can definitely reuse the code below (with some minor changes):
https://github.com/etcd-io/etcd/blob/ff898640a5c9bad0bb99a74d0799f810d54b3586/etcdutl/snapshot/v3_snapshot.go#L142-L167
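As a rough illustration of what "hash up to a given rev" means, here is a self-contained Go sketch. It is not the linked etcd code: the entry type, the hashUpTo helper, the choice of CRC32 (Castagnoli), and the byte layout fed to the hash are all simplifying assumptions. It only shows the filtering idea, where rev 0 means "no upper bound".

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// entry is a hypothetical, simplified stand-in for one record in the key bucket.
type entry struct {
	Rev   int64 // revision at which the record was written
	Key   string
	Value string
}

// hashUpTo hashes every entry whose revision is <= rev, in slice order.
// rev == 0 is treated as "no upper bound" (an assumption that mirrors
// the default discussed above).
func hashUpTo(entries []entry, rev int64) uint32 {
	h := crc32.New(crc32.MakeTable(crc32.Castagnoli))
	var buf [8]byte
	for _, e := range entries {
		if rev > 0 && e.Rev > rev {
			continue // written after the requested revision, skip it
		}
		binary.BigEndian.PutUint64(buf[:], uint64(e.Rev))
		h.Write(buf[:])
		h.Write([]byte(e.Key))
		h.Write([]byte(e.Value))
	}
	return h.Sum32()
}

func main() {
	entries := []entry{
		{Rev: 2, Key: "foo", Value: "a"},
		{Rev: 3, Key: "bar", Value: "b"},
		{Rev: 4, Key: "foo", Value: "c"},
	}
	// With rev=0 and rev=4 the same three entries are hashed.
	fmt.Println("rev=0:", hashUpTo(entries, 0))
	fmt.Println("rev=4:", hashUpTo(entries, 4))
	// With rev=3 the last write is excluded.
	fmt.Println("rev=3:", hashUpTo(entries, 3))
}
```

In this sketch, two members that agree on all writes up to revision 3 would produce the same value for rev=3 even if one of them has applied later writes, which is the property that makes a revision-bounded hash useful for comparing offline data directories.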
/cc @cenkalti
Re-open to update https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md.
Discussed during sig-etcd triage meeting. Changed to good-first-issue for someone to add missing CHANGELOG entry for 3.6.
Hi everyone, I've updated the changelog in this PR: https://github.com/etcd-io/etcd/pull/18460. Please let me know if I need to make any adjustments, since I'm new to this project. Thank you.
Closing as complete. https://github.com/etcd-io/etcd/pull/18460 updated the CHANGELOG.