Secrets reencryption fails on 8K+ secrets

Open dereknola opened this issue 3 years ago • 1 comments

Cluster Configuration: 3-server HA, but the issue is seen on any cluster config

Describe the bug:

K3s fails to reencrypt all the secrets on a cluster: it stops and exits the reencryption loop after 8000+ secrets. No immediate error is given, but K3s will begin to break as the secrets that were not reencrypted become inaccessible, because the previous key was destroyed.

Steps To Reproduce:

  • Create a 3 Server HA config with --secrets-encryption enabled
  • Add 10K secrets to the cluster:
kubectl create namespace test-encryption-key-rotation && for i in {1..10000}; do kubectl create secret generic test$i -n test-encryption-key-rotation --from-literal=key$i=value$i; done
  • Prepare, rotate, and reencrypt using the k3s secrets-encrypt tool as normal. See docs
  • Watch the logs during the reencryption phase. The reencryption ends before all 10K secrets have been processed; the logs will state something like:
level=info ...... reason: 'SecretsProgress' reencrypted 8100 secrets"
level=info .....  reason: 'SecretsUpdateComplete' completed reencrypt of 8105 secrets"
level=info msg="Removed key:  Name: aescbckey-2022-08-02T16:02:40Z, Secret: [REDACTED]"

Expected behavior: All 10K secrets are reencrypted

Actual behavior: Only "most of" the 10K secrets are reencrypted
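
The gap between expected and actual behavior can be checked from the logs rather than by eye. The following is a minimal sketch, not part of k3s: the function names `reencrypt_completed_count` and `reencrypt_covers` are hypothetical, and it assumes the 'SecretsUpdateComplete' message keeps the wording shown in the log excerpt above. The comparison is "at least" rather than "equal" because the reported total can include secrets outside the test namespace.

```shell
#!/bin/sh
# Hypothetical helper: read k3s log lines on stdin and print the count from
# the last "completed reencrypt of N secrets" message (prints nothing if absent).
reencrypt_completed_count() {
    sed -n 's/.*completed reencrypt of \([0-9][0-9]*\) secrets.*/\1/p' | tail -n 1
}

# Hypothetical helper: succeed only if the logs on stdin report that at least
# $1 secrets were reencrypted.
reencrypt_covers() {
    expected=$1
    actual=$(reencrypt_completed_count)
    [ -n "$actual" ] && [ "$actual" -ge "$expected" ]
}
```

Assuming the server logs to a systemd unit named k3s, usage might look like `journalctl -u k3s --no-pager | reencrypt_covers 10000 || echo "reencryption incomplete"`.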

Additional context / logs:

Backporting

  • [X] Needs backporting to older releases

dereknola avatar Aug 02 '22 17:08 dereknola

Validated on master branch with commit 6b7b9c5aa98efa2aabd2cfa872b1d214179161be

Environment Details

Infrastructure

  • [x] Cloud
  • [ ] Hosted

Node(s) CPU architecture, OS, and Version:

Linux ip-172-31-0-94 5.4.0-1009-aws #9-Ubuntu SMP Sun Apr 12 19:46:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"

Cluster Configuration:

HA: 3 servers, 1 agent

Config.yaml:

# server nodes
secrets-encryption: true

Testing Steps

  • Followed the steps mentioned here: https://github.com/k3s-io/k3s/issues/5933#issue-1326148682

Replication Results:

  • k3s version used for replication:
k3s -v
k3s version v1.24.3+k3s1 (990ba0e8)
go version go1.18.1
  • Tailing the logs showed that even though 10K secrets were created, only 8500 got reencrypted
Aug 08 15:20:01 ip-172-31-8-231 k3s[140261]: time="2022-08-08T15:20:01Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"\", Name:\"ip-172-31-8-231\", UID:\"ip-172-31-8-231\", APIVersion:\"\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'SecretsProgress' reencrypted 8490 secrets"
Aug 08 15:20:03 ip-172-31-8-231 k3s[140261]: time="2022-08-08T15:20:03Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"\", Name:\"ip-172-31-8-231\", UID:\"ip-172-31-8-231\", APIVersion:\"\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'SecretsUpdateComplete' completed reencrypt of 8500 secrets"
Aug 08 15:20:03 ip-172-31-8-231 k3s[140261]: time="2022-08-08T15:20:03Z" level=info msg="Removed key:  Name: aescbckey-2022-08-08T14:48:27Z, Secret: [REDACTED]"
Aug 08 15:20:03 ip-172-31-8-231 k3s[140261]: time="2022-08-08T15:20:03Z" level=warning msg="bootstrap key already exists"

Validation Results:

  • k3s version used for validation:
k3s -v
k3s version v1.24.3+k3s-6b7b9c5a (6b7b9c5a)
go version go1.18.1
  • Tailing the logs showed that all 10K secrets got reencrypted
Aug 08 16:49:28 ip-172-31-8-231 k3s[258043]: time="2022-08-08T16:49:28Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"\", Name:\"ip-172-31-8-231\", UID:\"ip-172-31-8-231\", APIVersion:\"\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'SecretsProgress' reencrypted 9990 secrets"
Aug 08 16:49:30 ip-172-31-8-231 k3s[258043]: time="2022-08-08T16:49:30Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"\", Name:\"ip-172-31-8-231\", UID:\"ip-172-31-8-231\", APIVersion:\"\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'SecretsProgress' reencrypted 10000 secrets"
Aug 08 16:49:31 ip-172-31-8-231 k3s[258043]: time="2022-08-08T16:49:31Z" level=info msg="Removed key:  Name: aescbckey-2022-08-08T16:08:16Z, Secret: [REDACTED]"
Aug 08 16:49:31 ip-172-31-8-231 k3s[258043]: time="2022-08-08T16:49:31Z" level=info msg="Event(v1.ObjectReference{Kind:\"Node\", Namespace:\"\", Name:\"ip-172-31-8-231\", UID:\"ip-172-31-8-231\", APIVersion:\"\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'SecretsUpdateComplete' completed reencrypt of 10007 secrets"
Aug 08 16:49:31 ip-172-31-8-231 k3s[258043]: time="2022-08-08T16:49:31Z" level=warning msg="bootstrap key already exists"
  • Hexdump
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt get /registry/secrets/test-encryption-key-rotation/test10000 | hexdump -C
00000000  2f 72 65 67 69 73 74 72  79 2f 73 65 63 72 65 74  |/registry/secret|
00000010  73 2f 74 65 73 74 2d 65  6e 63 72 79 70 74 69 6f  |s/test-encryptio|
00000020  6e 2d 6b 65 79 2d 72 6f  74 61 74 69 6f 6e 2f 74  |n-key-rotation/t|
00000030  65 73 74 31 30 30 30 30  0a 6b 38 73 3a 65 6e 63  |est10000.k8s:enc|
00000040  3a 61 65 73 63 62 63 3a  76 31 3a 61 65 73 63 62  |:aescbc:v1:aescb|
00000050  63 6b 65 79 2d 32 30 32  32 2d 30 38 2d 30 38 54  |ckey-2022-08-08T|
00000060  31 36 3a 30 38 3a 31 36  5a 3a 06 60 86 f9 76 1c  |16:08:16Z:.`..v.|
00000070  e5 40 72 0b 95 57 59 d4  20 e5 d9 2f 3e 16 60 52  |.@r..WY. ../>.`R|
00000080  f9 83 03 4d 2a a5 06 79  8e 49 a7 c2 ba 9b b3 5e  |...M*..y.I.....^|
00000090  2e b4 50 1b 5e c9 1f f4  4d c4 7f f0 68 b0 69 97  |..P.^...M...h.i.|
000000a0  91 5d d3 de 7a ab 1c eb  e9 30 86 8d 81 c6 9a ae  |.]..z....0......|
000000b0  e5 fe 32 5c c0 f5 f2 96  ea e3 1f 90 24 8d 8a 46  |..2\........$..F|
000000c0  fd 3d 02 d2 af f6 f8 93  f7 21 0d 18 4b dd 2e f7  |.=.......!..K...|
000000d0  c5 c9 35 b7 54 7b 83 e8  4b e1 05 2c a7 83 76 81  |..5.T{..K..,..v.|
000000e0  9e d6 5c d5 8e 97 03 f3  89 15 51 c1 64 31 29 28  |..\.......Q.d1)(|
000000f0  e4 1a 6e 31 d4 c0 77 83  d5 d9 86 a4 ba 10 f9 0f  |..n1..w.........|
00000100  b3 80 ea 0b 2b f5 27 f0  b1 6e 7e 2e 61 5a d9 2a  |....+.'..n~.aZ.*|
00000110  c2 ad 28 b9 4c 84 76 3c  5c e1 b1 17 8a a6 36 6a  |..(.L.v<\.....6j|
00000120  5a c0 7d 75 a1 af f4 cb  a8 10 e2 e7 74 76 6f 9f  |Z.}u........tvo.|
00000130  62 4a 29 3c 1a ee 1d f3  a5 14 59 1d d4 6a aa f9  |bJ)<......Y..j..|
00000140  bd 37 51 af 73 0e b7 de  75 34 99 65 3d db 32 9c  |.7Q.s...u4.e=.2.|
00000150  73 e0 70 a0 61 a9 b2 44  0a fe 6c af e0 35 3f c7  |s.p.a..D..l..5?.|
00000160  4a a1 28 44 41 bb 34 c9  25 3e f3 3c e9 92 3e 93  |J.(DA.4.%>.<..>.|
00000170  d2 17 50 90 ac fb f6 3e  96 03 0a                 |..P....>...|
0000017b
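
Reading the key name out of a hexdump is tedious when many secrets need to be checked. As a rough sketch, the name can be pulled from the raw etcd value directly. The function name `secret_key_name` is hypothetical, and the pattern assumes the `k8s:enc:aescbc:v1:<key name>:` prefix and the `aescbckey-<timestamp>` naming visible in the dump above.

```shell
#!/bin/sh
# Hypothetical helper: read a raw etcd value on stdin and print the name of
# the aescbc key recorded in its "k8s:enc:aescbc:v1:<key name>:" prefix.
# Prints nothing if no such prefix is found (for example, an unencrypted value).
secret_key_name() {
    # -a treats the binary ciphertext as text; -o prints only the matched prefix.
    LC_ALL=C grep -a -o -E 'k8s:enc:aescbc:v1:aescbckey-[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z' \
        | head -n 1 \
        | sed 's/^k8s:enc:aescbc:v1://'
}
```

Piping the `etcdctl ... get` command above into `secret_key_name` instead of `hexdump -C` would print just the key name, which can then be compared with the key names that appear in the logs.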

mdrahman-suse avatar Aug 08 '22 17:08 mdrahman-suse

Closing this out as it's been validated. @mdrahman-suse let me know if there was a reason to keep it open.

cwayne18 avatar Aug 30 '22 14:08 cwayne18