eks-anywhere
Long names of `Cluster` (and other) resources cause bootstrapping cluster to fail creation with generic error
What happened:
Attempting to bootstrap an EKS Anywhere cluster using a (very) long name for the `Cluster` (and other) resources causes creation of the `kind` bootstrap cluster to fail.
Amongst the plethora of logs, these snippets seem to be the most relevant:
[apiclient] All control plane components are healthy after 22.003180 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
...
I0115 20:11:27.767218 147 uploadconfig.go:131] [upload-config] Preserving the CRISocket information for the control-plane node
I0115 20:11:27.767232 147 patchnode.go:31] [patchnode] Uploading the CRI Socket information "unix:///run/containerd/containerd.sock" to the Node API object "diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane" as an annotation
...
I0115 20:13:27.771568 147 round_trippers.go:553] GET https://diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane:6443/api/v1/nodes/diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane?timeout=10s 404 Not Found in 2 milliseconds
I0115 20:13:27.775214 147 round_trippers.go:553] GET https://diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane:6443/api/v1/nodes/diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane?timeout=10s 404 Not Found in 2 milliseconds
...
nodes "diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane" not found
Error writing Crisocket information for the control-plane node
What you expected to happen: Bootstrap cluster creates successfully.
Alternatively (and if my guess is correct that the character length of the `Cluster` name is the root cause), a more helpful error message about the length of the names of (certain) Kubernetes objects should be shown. In such a case, it is probably fair to expect that `eksctl anywhere` validates the name length of the `Cluster` (and other resources) before actually creating the bootstrap cluster.
How to reproduce it (as minimally and precisely as possible):
- Try to bootstrap an EKS Anywhere cluster (using almost default configuration) against the vSphere provider (might be reproducible with other providers as well, as the problem manifests itself when creating the `KIND` bootstrap cluster).
- Make sure to use a very long name for the `Cluster` (and similarly for the other resources, including `VSphereMachineConfig`, `VSphereDatacenterConfig`). I used a name with 36 characters.
Renaming the `Cluster` object down to a more sensible 20-character string (and similarly for the `VSphereMachineConfig`, `VSphereDatacenterConfig` resources) fixed the problem.
Anything else we need to know?:
- Cluster name (obfuscated, but identical length and character layout): `diamond-foo22-prod-eksa-mgmt-cluster`
- (As evident in the last few lines of the logs shown above), the long `Cluster` name caused `eksctl anywhere` to name the control plane node group `diamond-foo22-prod-eksa-mgmt-cluster-eks-a-cluster-control-plane` (which is 64 characters in length)
Environment:
- EKS Anywhere Release: `v0.18.4`
- EKS Distro Release: `bottlerocket-v1.28.4-eks-d-1-28-12-eks-a-56-amd64`
- Operating System: Fresh Ubuntu 22.04 VM
Note:
I am not entirely sure if this is the correct place to file this bug, but it seemed a good place to start. If the community feels reporting this to `kind` or `kubeadm` makes more sense, I'll be happy to do so.
Hello, thanks for the report. We will try to replicate internally and get back.
Thanks for reporting @dejarikra! We were able to reproduce the issue on our end as well by setting the EKS-A `Cluster` resource's name to exactly 36 characters long. On the EKS-A side, we add a suffix `-eks-a-cluster` to get the KinD cluster's name (which makes it 50 characters long) and on top of this, KinD also adds a `-control-plane` suffix to the cluster name to arrive at the name for the control plane node (container), which makes the control plane node name 64 characters in length.
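The suffix arithmetic described above can be checked with a short sketch. The function and constant names are hypothetical and are not taken from the EKS-A or KinD code bases; only the two suffixes and the obfuscated sample name come from this thread.

```python
# Suffixes as described in this thread; all identifiers here are illustrative only.
EKSA_SUFFIX = "-eks-a-cluster"       # added by EKS-A to form the KinD cluster name
KIND_NODE_SUFFIX = "-control-plane"  # added by KinD to form the control plane node name


def kind_cluster_name(cluster_name: str) -> str:
    """Name of the bootstrap KinD cluster derived from the EKS-A Cluster name."""
    return cluster_name + EKSA_SUFFIX


def control_plane_node_name(cluster_name: str) -> str:
    """Name of the control plane node (container) derived from the Cluster name."""
    return kind_cluster_name(cluster_name) + KIND_NODE_SUFFIX


name = "diamond-foo22-prod-eksa-mgmt-cluster"
print(len(name))                           # 36
print(len(kind_cluster_name(name)))        # 50
print(len(control_plane_node_name(name)))  # 64
```

Each suffix is 14 characters long, so every `Cluster` name grows by 28 characters by the time it becomes the control plane node name.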
The interesting part is that Kind itself allows a maximum cluster name length of 50. I tried creating a cluster with a 37-character name and rightfully hit this warning, followed by a different error when creating the control-plane container. That error occurs because `docker run` is invoked with the `--hostname` parameter set to a 65-character hostname, which doesn't conform to `sethostname`'s 64-character restriction.
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: sethostname: invalid argument: unknown.
Based on some reading, I reached the conclusion that a maximum of 64 characters is allowed by both Kind and `sethostname`, so the issue may be something in `kubeadm` or Kubernetes itself. Will dig further and update this issue if I find something.
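As a rough illustration of the kind of pre-flight validation requested in the report, here is a minimal sketch that checks a `Cluster` name against the two limits mentioned in this thread (50 characters for the Kind cluster name, 64 for the hostname). `check_name` and the constants are hypothetical and not part of `eksctl anywhere`; the limits are taken from the comments above, not from a verified specification.

```python
# Limits as reported in this thread (assumptions, not verified against the sources).
KIND_MAX_CLUSTER_NAME = 50  # maximum Kind cluster name length
HOSTNAME_MAX = 64           # sethostname restriction

EKSA_SUFFIX = "-eks-a-cluster"
KIND_NODE_SUFFIX = "-control-plane"


def check_name(cluster_name: str) -> list[str]:
    """Return human-readable problems for the names derived from cluster_name."""
    kind_name = cluster_name + EKSA_SUFFIX
    node_name = kind_name + KIND_NODE_SUFFIX
    problems = []
    if len(kind_name) > KIND_MAX_CLUSTER_NAME:
        problems.append(
            f"Kind cluster name {kind_name!r} is {len(kind_name)} characters "
            f"(limit {KIND_MAX_CLUSTER_NAME})"
        )
    if len(node_name) > HOSTNAME_MAX:
        problems.append(
            f"control plane hostname {node_name!r} is {len(node_name)} characters "
            f"(limit {HOSTNAME_MAX})"
        )
    return problems


print(check_name("diamond-foo22-prod-eksa-mgmt-cluster"))  # 36 characters
print(check_name("x" * 37))                                 # 37 characters
```

With these two limits alone, the 36-character name from the report produces no problems while a 37-character name trips both checks. That matches the observation that the 36-character case fails for some other reason, so a real validation would need whatever stricter limit turns out to apply.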