etcd icon indicating copy to clipboard operation
etcd copied to clipboard

Enhance `etcdutl` to calculate hash of the data up to a given rev

Open ahrtr opened this issue 3 years ago • 10 comments

What would you like to be added?

Currently command etcdctl endpoint hashkv can return the hash up to a given rev of endpoints. It requires the etcdserver is still running. But in some cases, when users raise inconsistency issue, the etcd cluster might not be running any more. So we need to support calculating the hash offline, in other words, we need to support calculating hash using etcdutl as well.

Why is this needed?

It can improve the diagnosability / supportability

ahrtr avatar Jan 04 '23 08:01 ahrtr

any thoughts? @ptabor @serathius @spzala

ahrtr avatar Jan 04 '23 08:01 ahrtr

hi, I'd like to take this, is the command etcdutl hashkv input arg is data-dir ?

halegreen avatar Jan 08 '23 16:01 halegreen

is the command etcdutl hashkv input arg is data-dir ?

Yes, it looks good to me. Assigned to you, thx.

ahrtr avatar Jan 08 '23 23:01 ahrtr

I would think about a longer term perspective for the 'etcdutl' tool. I think we should eventually get rid of etcd-dump-logs and etcd-dump-db tools and position etcdutl as the tool for performing low-level operations on etcd files.

Let's look what we have currently from consistency perspective

./etcdutl snapshot status ./default.etcd/member/snap/db

./etcdutl snapshot restore <filename> --data-dir {output dir} [options] [flags]

And the latter command has support for:

 --skip-hash-check                      Ignore snapshot integrity hash value (required if copied from data directory)

So it seems that etcdutl snapshot {foo} <filename> is the pattern for command working on 'bbolt' files and we should preserve this.

Now the question is whether we need: etcdutl snapshot hashkv {file} Or etcdutl snapshot status is good enough to extend.

I would extend etcdutl snapshot status. It already has linear complexity (walks over the whole storage):

https://github.com/ptabor/etcd/blob/6f899a7b40f7631461ffeda0067aa4c42dd17812/etcdutl/snapshot/v3_snapshot.go#L142

So there is no 'order of magnitude' cost change if we compute the hash, side by side to the original functionality.

So I envision this as:

./bin/etcdutl snapshot status -w json ./default.etcd/member/snap/db
{"hash":3884838507,"revision":16541,"totalKey":33078,"totalSize":33112064,"version":"3.6.0", "hashkv":"..."}

@ahrtr @serathius FDYT ?

ptabor avatar Jan 10 '23 16:01 ptabor

I think we should eventually get rid of etcd-dump-logs and etcd-dump-db tools and position etcdutl as the tool for performing low-level operations on etcd files.

It seems a good direction to me. It doesn't make sense to have several scattered tools. It's good to have only one offline data analyzing tool etcdutl . I may spend some time to plan/think about it.

Now the question is whether we need: etcdutl snapshot hashkv {file} Or etcdutl snapshot status is good enough to extend.

I would extend etcdutl snapshot status. It already has linear complexity (walks over the whole storage):

https://github.com/ptabor/etcd/blob/6f899a7b40f7631461ffeda0067aa4c42dd17812/etcdutl/snapshot/v3_snapshot.go#L142

So there is no 'order of magnitude' cost change if we compute the hash, side by side to the original functionality.

So I envision this as:

./bin/etcdutl snapshot status -w json ./default.etcd/member/snap/db
{"hash":3884838507,"revision":16541,"totalKey":33078,"totalSize":33112064,"version":"3.6.0", "hashkv":"..."}

I prefer to etcdutl snapshot hashkv {file}, and we need to support flag --rev ${rev} so that we can calculate the hash up to the given rev. It defaults to 0, and it should have the same hash value as the etcdutl snapshot status in this case. I expect it has similar output as etcdctl endpoint hashkv (of course without the field endpoint).

I am not worry about 'order of magnitude' cost change. From implementation level, we can definitely reuse the same code below (of course some minor change is needed),

https://github.com/etcd-io/etcd/blob/ff898640a5c9bad0bb99a74d0799f810d54b3586/etcdutl/snapshot/v3_snapshot.go#L142-L167

ahrtr avatar Jan 10 '23 19:01 ahrtr

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 21 '23 17:05 stale[bot]

/cc @cenkalti

chaochn47 avatar May 24 '23 06:05 chaochn47

Re-open to update https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md.

jmhbnz avatar Jun 03 '24 20:06 jmhbnz

Discussed during sig-etcd triage meeting. Changed to good-first-issue for someone to add missing CHANGELOG entry for 3.6.

jmhbnz avatar Aug 15 '24 18:08 jmhbnz

Hi everyone, I've updated the changelog in this pr : https://github.com/etcd-io/etcd/pull/18460, please do let me know if I need to make any adjustments since I'm new to this project. Thank you.

mello369 avatar Aug 17 '24 11:08 mello369

Closing as complete. https://github.com/etcd-io/etcd/pull/18460 updated the CHANGELOG.

ivanvc avatar Sep 19 '24 00:09 ivanvc