[Bug] Create Cluster Hanging when clusterEndpoints.publicAccess=false
When disabling public access with clusterEndpoints.publicAccess=false, eksctl doesn't finish executing. Is this expected behavior?
2022-07-08 17:17:09 [ℹ] eksctl version 0.99.0
2022-07-08 17:17:09 [ℹ] using region us-east-1
2022-07-08 17:17:09 [!] warning, having public access disallowed will subsequently interfere with some features of eksctl. This will require running subsequent eksctl (and Kubernetes) commands/API calls from within the VPC. Running these in the VPC requires making updates to some AWS resources. See: https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html for more details
2022-07-08 17:17:09 [ℹ] subnets for us-east-1a - public:192.168.0.0/19 private:192.168.64.0/19
2022-07-08 17:17:09 [ℹ] subnets for us-east-1b - public:192.168.32.0/19 private:192.168.96.0/19
2022-07-08 17:17:09 [ℹ] nodegroup "core-services" will use "" [AmazonLinux2/1.22]
2022-07-08 17:17:09 [ℹ] nodegroup "Airflow-Tasks-NodeGroup" will use "" [AmazonLinux2/1.22]
2022-07-08 17:17:09 [ℹ] using Kubernetes version 1.22
2022-07-08 17:17:09 [ℹ] creating EKS cluster "kalaffia" in "us-east-1" region with managed nodes
2022-07-08 17:17:09 [ℹ] 2 nodegroups (Airflow-Tasks-NodeGroup, core-services) were included (based on the include/exclude rules)
2022-07-08 17:17:09 [ℹ] will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2022-07-08 17:17:09 [ℹ] will create a CloudFormation stack for cluster itself and 2 managed nodegroup stack(s)
2022-07-08 17:17:09 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-1 --cluster=kalaffia'
2022-07-08 17:17:09 [ℹ] Kubernetes API endpoint access will use provided values {publicAccess=false, privateAccess=true} for cluster "kalaffia" in "us-east-1"
2022-07-08 17:17:09 [ℹ] configuring CloudWatch logging for cluster "kalaffia" in "us-east-1" (enabled types: api, audit, authenticator, controllerManager, scheduler & no types disabled)
2022-07-08 17:17:09 [ℹ]
2 sequential tasks: { create cluster control plane "kalaffia",
2 sequential sub-tasks: {
5 sequential sub-tasks: {
wait for control plane to become ready,
update CloudWatch log retention,
associate IAM OIDC provider,
10 parallel sub-tasks: {
create IAM role for serviceaccount "eso/eso-service-account",
create IAM role for serviceaccount "fdw/postgres-service-account",
create IAM role for serviceaccount "datadog/datadog-cluster-agent",
create IAM role for serviceaccount "monitoring/grafana-service-account",
create IAM role for serviceaccount "services/annotation-review-service-account",
create IAM role for serviceaccount "support/mlflow-service-account",
create IAM role for serviceaccount "airflow/airflow-worker-service-account",
create IAM role for serviceaccount "services/graphql-service-account",
create IAM role for serviceaccount "external-dns/external-dns-service-account",
2 sequential sub-tasks: {
create IAM role for serviceaccount "kube-system/aws-node",
create serviceaccount "kube-system/aws-node",
},
},
restart daemonset "kube-system/aws-node",
},
2 parallel sub-tasks: {
2 sequential sub-tasks: {
create managed nodegroup "core-services",
propagate tags to ASG for managed nodegroup "core-services",
},
2 sequential sub-tasks: {
create managed nodegroup "Airflow-Tasks-NodeGroup",
propagate tags to ASG for managed nodegroup "Airflow-Tasks-NodeGroup",
},
},
}
}
2022-07-08 17:17:09 [ℹ] building cluster stack "eksctl-kalaffia-cluster"
2022-07-08 17:17:10 [ℹ] deploying stack "eksctl-kalaffia-cluster"
2022-07-08 17:17:40 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:18:10 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:19:10 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:20:11 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:21:11 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:22:11 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:23:11 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:24:12 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:25:12 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:26:12 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:27:13 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:28:13 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:29:13 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:30:14 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:31:14 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:32:14 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-08 17:33:14 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
I am writing this 20 minutes after the last log line. The CLI neither times out nor moves on to the IAM or nodegroup stacks.
What were you trying to accomplish?
Cluster created, node groups created, IAM roles created
What happened?
Created cluster and then froze
Versions
$ eksctl info
eksctl version: 0.99.0
kubectl version: v1.24.1
OS: darwin
Hello @corinz 👋🏻 It looks like you are using an old eksctl version. Have you tried the latest one?
@Himangini I upgraded to v0.105.0. Now I am seeing an error that suggests the Kubernetes API is unreachable:
error creating Clientset: getting list of API resources for raw REST client: Get "https://902C19A7E34257BA6E220E9694B403EE.gr7.us-east-1.eks.amazonaws.com/api?timeout=32s": dial tcp 192.168.105.205:443: i/o timeout
The expected behavior is that eksctl can handle the creation of a private cluster. What is eksctl attempting to do by reaching the Kubernetes API?
Full output:
eksctl create cluster \
-f ${EKSCTL_ENV_SUBST}.cluster.yaml \
--auto-kubeconfig
2022-07-11 09:34:27 [ℹ] eksctl version 0.105.0-dev+aa76f1d4.2022-07-08T14:38:11Z
2022-07-11 09:34:27 [ℹ] using region us-east-1
2022-07-11 09:34:27 [!] warning, having public access disallowed will subsequently interfere with some features of eksctl. This will require running subsequent eksctl (and Kubernetes) commands/API calls from within the VPC. Running these in the VPC requires making updates to some AWS resources. See: https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html for more details
2022-07-11 09:34:27 [ℹ] subnets for us-east-1a - public:192.168.0.0/19 private:192.168.64.0/19
2022-07-11 09:34:27 [ℹ] subnets for us-east-1b - public:192.168.32.0/19 private:192.168.96.0/19
2022-07-11 09:34:27 [ℹ] nodegroup "core-services" will use "" [AmazonLinux2/1.22]
2022-07-11 09:34:27 [ℹ] nodegroup "Airflow-Tasks-NodeGroup" will use "" [AmazonLinux2/1.22]
2022-07-11 09:34:27 [ℹ] using Kubernetes version 1.22
2022-07-11 09:34:27 [ℹ] creating EKS cluster "kalaffia" in "us-east-1" region with managed nodes
2022-07-11 09:34:27 [ℹ] 2 nodegroups (Airflow-Tasks-NodeGroup, core-services) were included (based on the include/exclude rules)
2022-07-11 09:34:27 [ℹ] will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2022-07-11 09:34:27 [ℹ] will create a CloudFormation stack for cluster itself and 2 managed nodegroup stack(s)
2022-07-11 09:34:27 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-1 --cluster=kalaffia'
2022-07-11 09:34:27 [ℹ] Kubernetes API endpoint access will use provided values {publicAccess=false, privateAccess=true} for cluster "kalaffia" in "us-east-1"
2022-07-11 09:34:27 [ℹ] configuring CloudWatch logging for cluster "kalaffia" in "us-east-1" (enabled types: api, audit, authenticator, controllerManager, scheduler & no types disabled)
2022-07-11 09:34:27 [ℹ]
2 sequential tasks: { create cluster control plane "kalaffia",
2 sequential sub-tasks: {
5 sequential sub-tasks: {
wait for control plane to become ready,
update CloudWatch log retention,
associate IAM OIDC provider,
10 parallel sub-tasks: {
create IAM role for serviceaccount "eso/eso-service-account",
create IAM role for serviceaccount "fdw/postgres-service-account",
create IAM role for serviceaccount "datadog/datadog-cluster-agent",
create IAM role for serviceaccount "monitoring/grafana-service-account",
create IAM role for serviceaccount "services/annotation-review-service-account",
create IAM role for serviceaccount "support/mlflow-service-account",
create IAM role for serviceaccount "airflow/airflow-worker-service-account",
create IAM role for serviceaccount "services/graphql-service-account",
create IAM role for serviceaccount "external-dns/external-dns-service-account",
2 sequential sub-tasks: {
create IAM role for serviceaccount "kube-system/aws-node",
create serviceaccount "kube-system/aws-node",
},
},
restart daemonset "kube-system/aws-node",
},
2 parallel sub-tasks: {
2 sequential sub-tasks: {
create managed nodegroup "core-services",
propagate tags to ASG for managed nodegroup "core-services",
},
2 sequential sub-tasks: {
create managed nodegroup "Airflow-Tasks-NodeGroup",
propagate tags to ASG for managed nodegroup "Airflow-Tasks-NodeGroup",
},
},
}
}
2022-07-11 09:34:27 [ℹ] building cluster stack "eksctl-kalaffia-cluster"
2022-07-11 09:34:28 [ℹ] deploying stack "eksctl-kalaffia-cluster"
2022-07-11 09:34:58 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:35:28 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:36:28 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:37:29 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:38:29 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:39:29 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:40:29 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:41:30 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:42:30 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:43:30 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:44:30 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:45:31 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:46:31 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:47:31 [ℹ] waiting for CloudFormation stack "eksctl-kalaffia-cluster"
2022-07-11 09:48:02 [!] 1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
2022-07-11 09:48:02 [ℹ] to cleanup resources, run 'eksctl delete cluster --region=us-east-1 --name=kalaffia'
2022-07-11 09:48:02 [✖] error creating Clientset: getting list of API resources for raw REST client: Get "https://902C19A7E34257BA6E220E9694B403EE.gr7.us-east-1.eks.amazonaws.com/api?timeout=32s": dial tcp 192.168.105.205:443: i/o timeout
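The address in the timeout error is worth a closer look: 192.168.105.205 is a private address, and it appears to fall inside the private subnet logged for us-east-1b (192.168.96.0/19). A minimal Python sketch of that check follows, using only the standard library. The helper names `matching_subnet` and `can_connect` are hypothetical, the values are copied from the log above, and the TCP probe is only a rough stand-in for what a Kubernetes client actually does.

```python
import ipaddress
import socket

# Values copied from the log output above.
ENDPOINT_IP = "192.168.105.205"
PRIVATE_SUBNETS = ["192.168.64.0/19", "192.168.96.0/19"]


def matching_subnet(ip, subnets):
    """Return the first CIDR block that contains ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for cidr in subnets:
        if addr in ipaddress.ip_network(cidr):
            return cidr
    return None


def can_connect(host, port=443, timeout=3.0):
    """Rough reachability probe: try to open a TCP connection to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


if __name__ == "__main__":
    print("private address:", ipaddress.ip_address(ENDPOINT_IP).is_private)
    print("inside subnet:", matching_subnet(ENDPOINT_IP, PRIVATE_SUBNETS))
    print("reachable from here:", can_connect(ENDPOINT_IP))
```

From a machine outside the VPC, `can_connect` would be expected to return False, which is consistent with the `i/o timeout` in the log.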
According to the documentation:
- EKS does allow creating a configuration that allows only private access to be enabled, but eksctl doesn't support it during cluster creation as it prevents eksctl from being able to join the worker nodes to the cluster.
It would be a great automation feature if eksctl supported this use case the same way it handles fully private clusters, where it turns off the public endpoint at the end of provisioning.
@corinz, eksctl must be run from within the same VPC (or via some other means like AWS Direct Connect) if public endpoint access is disabled, otherwise it cannot connect to the API server, eventually failing with a timeout error. Since you're letting eksctl create the VPC and other networking resources, your best bet is to create the cluster with public endpoint access enabled, and disable it post cluster creation with eksctl utils update-cluster-endpoints.
I have removed the bug label as this behaviour is documented and also logged.
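The two-step workaround described above could be scripted roughly as follows. This is a sketch, not an official procedure: the config file name `cluster.yaml` is a placeholder, the function names are hypothetical, and it assumes the config sets `publicAccess: true` for the initial create. The cluster name and region are taken from the logs in this issue. The flags passed to `eksctl utils update-cluster-endpoints` are written from memory of the eksctl documentation, so check `--help` for your version before relying on them.

```python
import shlex
import subprocess


def create_cmd(config_file):
    """Step 1: create the cluster while the public endpoint is still enabled."""
    return ["eksctl", "create", "cluster", "-f", config_file]


def lock_down_cmd(cluster, region):
    """Step 2: switch the API endpoint to private-only after creation."""
    return [
        "eksctl", "utils", "update-cluster-endpoints",
        f"--cluster={cluster}",
        f"--region={region}",
        "--private-access=true",
        "--public-access=false",
        "--approve",  # assumed: without it eksctl only shows the planned change
    ]


def run_all(commands, execute=False):
    """Print each command; run it only when execute=True."""
    for cmd in commands:
        print(shlex.join(cmd))
        if execute:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: the commands are printed, not executed.
    run_all([create_cmd("cluster.yaml"), lock_down_cmd("kalaffia", "us-east-1")])
```

Calling `run_all(..., execute=True)` would run both steps in order and stop at the first failure. After the second step, further eksctl and kubectl commands would need to run from inside the VPC or over a connected network.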
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Please let us know if this answers your question so we can close the issue.
@cPu1 thank you! This can be closed
Sorry to comment again. Is it possible to join the worker nodes to the cluster manually? I can configure VPC peering or a transit gateway after creating the cluster with a private endpoint. Is there any method to join the worker nodes to the cluster?
BTW, I confirmed that adding a new node group in the EKS section of the AWS console joins the orphaned worker nodes created by eksctl to the cluster, but adding a nodegroup with eksctl from the command line does not.