microk8s
microk8s join failure - unable to verify the identity of the remote server
Summary
When building multi-node microk8s clusters, new nodes fail to join the initial seed node due to an SSL verification issue.
What Should Happen Instead?
New nodes should join the cluster without a certificate verification error.
Reproduction Steps
First node:
$ sudo microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 10.246.114.14:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
Use the '--worker' flag to join a node as a worker not running the control plane, eg:
microk8s join 10.246.114.14:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77 --worker
If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 10.246.114.14:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.12.8:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.0.3:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.20.1:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.8.1:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.16.1:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
microk8s join 10.6.4.1:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
Second node:
$ sudo microk8s join 10.246.114.14:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77
Contacting cluster at 10.246.114.14
Joining cluster failed. Could not verify the identity of 10.246.114.14. Use '--skip-verify' to skip server certificate check.
Introspection Report
See attached, taken from the seed node (node-gadomski).
openssl showcerts on join target: https://paste.ubuntu.com/p/yVrH2G6X3R/
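To inspect the certificate that the join target is actually serving, something like the following may help. It is only a sketch: `served_cert_sha256` is a hypothetical helper name, and the assumption that the join URL's fingerprint is derived from a SHA-256 digest of this certificate is mine, not something confirmed in this thread.

```shell
# Hypothetical helper: print the SHA-256 digest (hex) of the DER-encoded
# certificate served at HOST:PORT, e.g. 10.246.114.14:25000.
# ASSUMPTION: the join fingerprint relates to this digest; not confirmed here.
served_cert_sha256() {
    # s_client prints the server certificate in PEM form; x509 re-encodes
    # the first certificate it finds as DER, which is then hashed.
    openssl s_client -connect "$1" </dev/null 2>/dev/null \
        | openssl x509 -outform DER \
        | sha256sum \
        | cut -d' ' -f1
}

# Usage (run from the joining node, needs network access to the seed node):
#   served_cert_sha256 10.246.114.14:25000
```

Running it before and after a second `microk8s add-node` would show whether the served certificate itself changed between the two runs.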
Hi @javacruft, can you run the microk8s add-node command a second time on the first node? Do you notice any changes in the fingerprint (last part of the join URL)?
I ran that command again:
$ sudo microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 10.246.114.14:25000/035dba01bdd43a7d8948971d07a830f1/af562a782d00
Use the '--worker' flag to join a node as a worker not running the control plane, eg:
microk8s join 10.246.114.14:25000/035dba01bdd43a7d8948971d07a830f1/af562a782d00 --worker
It has indeed changed.
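For anyone repeating this check, the comparison can be scripted rather than eyeballed. A minimal sketch, where `join_fingerprint` is a hypothetical helper that simply takes the last `/`-separated segment of a join URL, as described in the question above:

```shell
# Hypothetical helper: print the last "/"-separated segment of a join URL.
join_fingerprint() {
    # Drop trailing options such as " --worker", then strip up to the last "/".
    url=${1%% *}
    printf '%s\n' "${url##*/}"
}

# The two join URLs printed by the two add-node runs in this thread.
first=$(join_fingerprint "10.246.114.14:25000/f49cecee3a66113957fc76e6e9977619/3bf432255b77")
second=$(join_fingerprint "10.246.114.14:25000/035dba01bdd43a7d8948971d07a830f1/af562a782d00")

if [ "$first" != "$second" ]; then
    echo "fingerprint changed: $first -> $second"
fi
```

The middle segment also differs between the two runs, so only the last segment is compared here.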
Probably worth mentioning that we're using the 1.26-strict channel.
I chased this down, and it appears to be related to clustering when the no-cert-reissue lock file (/var/snap/microk8s/current/var/lock/no-cert-reissue) is in place. The lock prevents the certificates from being refreshed on the node that is joining the cluster, but that refresh is required for the node to get new certificates that match the cluster.
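A quick way to see whether a node is in this state is to test for the lock file. This is a minimal sketch: `cert_reissue_locked` is a hypothetical helper, the default path is the one quoted above, and on a strict snap the check may need to run with sudo.

```shell
# Hypothetical helper: succeed if the no-cert-reissue lock file exists.
# An alternative path can be passed as the first argument.
cert_reissue_locked() {
    lock_file=${1:-/var/snap/microk8s/current/var/lock/no-cert-reissue}
    [ -e "$lock_file" ]
}

if cert_reissue_locked; then
    echo "no-cert-reissue lock is present"
else
    echo "no-cert-reissue lock is absent"
fi
```

It only reports the state of the lock; it does not change anything on the node.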
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.