containers-roadmap
[EKS] [request]: VPC endpoint support for EKS API
Tell us about your request VPC endpoint support for EKS, so that worker nodes can register with an EKS-managed cluster without requiring outbound internet access.
Which service(s) is this request for? EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Worker nodes based on the EKS AMI run bootstrap.sh to connect themselves to the cluster. As part of this process, `aws eks describe-cluster` is called, which currently requires outbound internet access.
I'd love to be able to turn off outbound internet access but still easily bootstrap worker nodes without providing additional configuration.
Are you currently working around this issue?
- Providing outbound internet access to worker nodes; OR
- Supplying the cluster CA and API endpoint directly to bootstrap.sh.
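For reference, the second workaround can be sketched in instance user data like this (cluster name, endpoint, and CA value below are placeholders; `--apiserver-endpoint` and `--b64-cluster-ca` are flags supported by the EKS AMI's bootstrap.sh):

```shell
#!/usr/bin/env bash
# Pass the cluster CA and API endpoint directly to bootstrap.sh so it
# does not need to call `aws eks describe-cluster` over the internet.
# Replace the placeholder values with your own cluster's details.
B64_CLUSTER_CA="LS0tLS1CRUdJTi..."   # base64-encoded cluster CA certificate
API_ENDPOINT="https://ABCDEF1234567890.gr7.us-east-1.eks.amazonaws.com"

/etc/eks/bootstrap.sh my-cluster \
  --apiserver-endpoint "${API_ENDPOINT}" \
  --b64-cluster-ca "${B64_CLUSTER_CA}"
```

The values can be baked into a launch template at provisioning time (e.g. from Terraform or CloudFormation outputs), which avoids any runtime dependency on the EKS API.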
Additional context
- Relates somewhat to #22 & #221, but for the AWS EKS API rather than the Kubernetes control plane API
Is there any news on this?
Any updates on this issue?
If you use EKS Managed Nodes, the bootstrapping process avoids the `aws eks describe-cluster` API call, so you can launch workers into a private subnet without outbound internet access, as long as you set up the other required PrivateLink endpoints correctly.
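As a sketch of that setup (all names and IDs below are placeholders, not from the thread), a managed node group can be launched into private subnets with the AWS CLI:

```shell
# Create an EKS managed node group in private subnets. Managed nodes
# skip the `aws eks describe-cluster` call during bootstrap, so no
# outbound internet access is needed for registration itself.
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name private-workers \
  --subnets subnet-0aaa subnet-0bbb \
  --node-role arn:aws:iam::123456789012:role/NodeInstanceRole \
  --scaling-config minSize=1,maxSize=3,desiredSize=1
```

The subnets referenced here still need the ECR, EC2, and S3 endpoints discussed later in the thread for image pulls to work.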
Thanks Mike. Unfortunately managed nodes are not an option because they cannot be scaled to 0. We run some machine learning workloads that require scaling up ASGs with expensive VMs (x1.32xlarge) and we need to be able to scale them back to 0 once the workloads have completed.
Thanks for the feedback. Can you open a separate GH issue with that feature request for Managed Node Groups?
Will keep this issue open as it's something we are researching.
@mikestef9 I'm interested in the managed nodes solution. What do you mean by "you can launch workers into a private subnet without outbound internet access as long as you setup the other required PrivateLink endpoints correctly"?
Which PrivateLink endpoints are you referring to? Just the other service endpoints such as SQS and SNS that the applications running on the cluster may happen to use? Or do you mean that there are particular PrivateLink endpoints required to run EKS in private subnets with no internet gateway?
Hi @dsw88,
In order for the worker node to join the cluster, you will need to configure VPC endpoints for ECR, EC2, and S3.
See this GH repo https://github.com/jpbarto/private-eks-cluster created by an AWS Solutions Architect for a reference implementation. Note that only EKS clusters on 1.13 and above have a kubelet version that is compatible with the ECR VPC endpoint.
@mikestef9 Thanks so much for the info, and thanks for the pointer to the private EKS cluster reference repository!
I have one final question that I'm having a hard time figuring out how to deal with: How can I configure other hosts in this same private VPC to be able to talk to the cluster? Knowing the private DNS name isn't a huge deal, because I can just hard-code it into whatever needs to talk to the cluster. A bigger problem, however, is how a host in the private VPC can authenticate with the cluster.
Currently when I use the AWS API to set up a kubeconfig with EKS, it includes the following snippet in the generated kubeconfig file:
```yaml
- name: arn:aws:eks:REGION:ACCOUNT_ID:cluster/CLUSTER_NAME
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - --region
      - REGION
      - eks
      - get-token
      - --cluster-name
      - CLUSTER_NAME
      command: aws
      env: null
```
As you can see, it calls the EKS API to get a token that authenticates it with the cluster. That definitely presents a problem, since my hosts in the private VPC also don't have access to the EKS API. Is there another way to authenticate to the cluster without EKS API access?
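One possible aside here (an assumption worth verifying, not an answer from the thread): both `aws eks get-token` and aws-iam-authenticator construct the token locally by presigning an STS GetCallerIdentity request, so token generation reaches STS rather than the EKS management endpoint:

```shell
# Generate a cluster auth token with aws-iam-authenticator. The cluster
# name (here a placeholder) is only embedded in the presigned STS URL;
# no call is made to eks.<region>.amazonaws.com.
aws-iam-authenticator token -i CLUSTER_NAME

# Equivalent with the AWS CLI; despite living under the `eks` namespace,
# this subcommand is also signed locally against STS.
aws eks get-token --cluster-name CLUSTER_NAME
```

If that holds, an STS interface endpoint in the private VPC may be enough for authentication; it is `aws eks update-kubeconfig` (which calls describe-cluster) that still needs EKS API access.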
> See this GH repo https://github.com/jpbarto/private-eks-cluster created by an AWS Solutions Architect for a reference implementation. Note that only 1.13 and above EKS clusters have a kubelet version that is compatible with the ECR VPC endpoint.
It seems that this repo uses unmanaged nodes though. I tried deploying it and it brought up a cluster without any nodes listed under the EKS web console. Is this correct?
@mikestef9 Thank you very much for this clue. Now I have a working setup with managed worker groups and no access to the Internet :tada:
I was not sure if it's feasible as the documentation says:
> Amazon EKS managed node groups can be launched in both public and private subnets. The only requirement is for the subnets to have outbound internet access. Amazon EKS automatically associates a public IP to the instances started as part of a managed node group to ensure that these instances can successfully join a cluster.
Well, apparently it is. If someone needs working Terraform recipes, ping me [email protected].
@vranystepan great to hear you have this working. As part of our fix for #607 we will make sure to get our documentation updated.
This is still a real issue.
I need to actually create and delete new clusters from private subnets with no NAT or egress gateways. I can create private endpoints for apparently every AWS service but EKS. This is a deep pain for some customers, as we have to build complicated workarounds to route traffic to the EKS service, whereas every other AWS service is easily exposed with a private endpoint.
I agree with @duckie this issue should not be closed yet. EKS support is laughable.
I agree that VPC endpoints are still very important, and this issue should be kept open. It is possible to run EKS clusters in private subnets with no internet egress, but it is not possible to manage those clusters from within that private VPC. We are limited in the tooling we can develop around EKS for lifecycle actions such as creating, updating, and deleting clusters because we can't perform those actions inside our private VPC. Please consider implementing a VPC endpoint for EKS! Thanks!
Hi, is there any workaround for this issue? We should be able to create and manage an EKS cluster in a private VPC. In our situation (due to security policies), our bastion server (and VPC) doesn't have public access. In that case, how can we create an EKS cluster? We are using Terraform to provision EKS.
Is there any status on this issue? This is a real problem for vendors that only use bootstrap.sh to perform automated EKS deployments, because our environments are private. I would like to know if anyone is working on this EKS private endpoint? Thanks
We have the problem too. We've built a private cluster for a private VPC with CDK (the VPC is connected to a Transit Gateway). CDK makes use of a custom resource Lambda for doing the kubeconfig update. When the cluster endpointAccess is private (or public and private), this Lambda is associated with the VPC (via ENIs). The Lambda function calls "aws eks update-kubeconfig" from "inside" the VPC, but is unable to access the cluster endpoint and fails with a timeout. All necessary VPC endpoints (according to the official EKS docs) are set (ecr.api, ecr.dkr, s3, ...).
+1 Making fully private clusters that are custom cloud formation resources is actually not possible without this: a lambda in VPC cannot get kubectl tokens.
+1 For my case, I cannot use codebuild with attached VPC (all subnets are private) to call to the private EKS cluster via "aws eks update-kubeconfig"
The result would be:

```
Connect timeout on endpoint URL: "https://eks.<region>.amazonaws.com/clusters/xxxxx"
```
When I create a cluster with no internet access, I get the error below. Is there any update on VPC endpoint support for the EKS API?
Command used to create cluster:
```shell
aws eks create-cluster \
  --region ap-southeast-1 \
  --name CP-EKS-TEST-NHSK \
  --kubernetes-version 1.21 \
  --role-arn arn:aws:iam::4103:role/nhsk \
  --resources-vpc-config subnetIds=subnet-063b9,subnet-04,securityGroupIds=sg-03
```

Error message:

```
connect timeout on endpoint url: "https://eks.ap-southeast-1.amazonaws.com/clusters"
```
I need this as well. Is there a solution or a current workaround yet?
Commenting as well. An EKS VPC Endpoint would be a huge help. Have there been any updates recently?
@mikestef9
> If you use EKS Managed Nodes, the bootstrapping process avoids the aws eks describe-cluster API call, so you can launch workers into a private subnet without outbound internet access as long as you setup the other required PrivateLink endpoints correctly.
Mike, what are the "other required endpoints"? Is there a list somewhere that says, "here are all of the endpoints that a managed node requires"?
@deitch imho the following VPC endpoints are required:
- ecr.api with interface mode
- ecr.dkr with interface mode
- s3 with gateway mode. For this one you also need to configure a new route to reach S3 via the gateway.
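A sketch of creating those endpoints with the AWS CLI (all IDs and the region are placeholders; adjust to your VPC):

```shell
# Interface endpoints for ECR (API and Docker registry). Private DNS
# lets the kubelet resolve the standard ECR hostnames to the endpoint.
for svc in ecr.api ecr.dkr; do
  aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Interface \
    --service-name "com.amazonaws.eu-west-1.${svc}" \
    --subnet-ids subnet-0aaa subnet-0bbb \
    --security-group-ids sg-0ccc \
    --private-dns-enabled
done

# Gateway endpoint for S3; passing the route table here adds the S3
# prefix-list route automatically, covering the extra route mentioned above.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.eu-west-1.s3 \
  --route-table-ids rtb-0ddd
```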
Cool thanks. Are the ECR only if you use containers from ECR? Or general requirement?
This should be documented formally somewhere in AWS.
With EKS, ECR access is required to bootstrap nodes. And because ECR stores image layers in S3 under the hood, you also have to allow access to S3. You can take a look at this documentation for EKS: https://docs.aws.amazon.com/eks/latest/userguide/private-clusters.html
Much appreciated.
Are there any updates on this team?
Cluster autoscaler, when running in a private EKS cluster, also experiences that problem:
```
managed_nodegroup_cache.go:133] Failed to query the managed nodegroup foo for the cluster bar while looking for labels/taints: RequestError: send request failed
caused by: Get "https://eks.<region>.amazonaws.com/clusters/bar/node-groups/foo": dial tcp <*public_IP*>:443: i/o timeout
```
After reading https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html, I think there could be a workaround: "DHCP options set for your VPC must include AmazonProvidedDNS in its domain name servers list". But I'm not sure which domain name to configure in the DHCP options... Should it be eks.<region>.amazonaws.com?
Amazon EKS now supports AWS PrivateLink for the EKS management APIs.
A few callouts:

- VPC endpoint policies are not supported.
- EKS support for AWS PrivateLink is available in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon, N. California), Africa (Cape Town), Asia Pacific (Hong Kong, Mumbai, Singapore, Sydney, Seoul, Tokyo), Canada (Central), Europe (Ireland, Frankfurt, London, Stockholm, Paris, Milan), Middle East (Bahrain), South America (Sao Paulo), AWS GovCloud (US), China (Beijing), and China (Ningxia).
- EKS API PrivateLink is not yet available in the following regions: Asia Pacific (Osaka), Asia Pacific (Jakarta), Middle East (UAE).
- This is PrivateLink support for the EKS management APIs (CreateCluster etc.), not the Kubernetes API endpoint of a cluster. EKS already supports a private endpoint for the Kubernetes API server, although it's implemented in a different manner from PrivateLink (and we are aware of the open feature request for the cluster private endpoint to be implemented as a standard PrivateLink endpoint).
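With this launch, the EKS management API can be exposed in a VPC like any other interface endpoint. A sketch (VPC, subnet, and security group IDs are placeholders):

```shell
# Create an interface endpoint for the EKS management APIs.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.eks \
  --subnet-ids subnet-0aaa subnet-0bbb \
  --security-group-ids sg-0ccc \
  --private-dns-enabled

# With private DNS enabled, calls such as `aws eks describe-cluster` or
# `aws eks update-kubeconfig` made from inside the VPC resolve
# eks.us-east-1.amazonaws.com to the endpoint's private IPs.
```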