
[EKS] [request]: Add/Delete/Update Subnets Registered with the Control Plane

Open christopherhein opened this issue 5 years ago • 95 comments

Tell us about your request: The ability to update the subnets that the EKS control plane is registered with.

Which service(s) is this request for? EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? https://twitter.com/julien_fabre/status/1099071498621411329

Are you currently working around this issue?

Additional context

christopherhein avatar Feb 23 '19 00:02 christopherhein

/cc @Pryz

christopherhein avatar Feb 23 '19 01:02 christopherhein

This would be a nice improvement

jmeichle avatar Feb 23 '19 01:02 jmeichle

To add some color, here are some use cases:

  1. You have a multi-tenant cluster configured with X subnets, you are getting close to IP exhaustion, and you want to extend the setup with Y more subnets, without losing the current configuration of course.

  2. You are expanding your setup to new availability zones and want to use new subnets to schedule pods there.

  3. You were using your cluster on private subnets only and now want to extend it to use some public subnets.

Generally, in many environments the network setup keeps changing, and EKS needs to be flexible enough to embrace such changes.

Thanks!

Pryz avatar Feb 23 '19 01:02 Pryz

@Pryz Your worker nodes don't have to be in the same subnets that your control plane is configured for. The latter are used for creating the ENIs that are used for kubectl logs|exec|attach and for ELB/NLB placement.
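For example, to see where those control plane ENIs live, something like this should work (a sketch; it assumes the default ENI description format "Amazon EKS <cluster-name>", and my-cluster is a placeholder):

    # List the ENIs EKS created for the control plane, and their subnets
    aws ec2 describe-network-interfaces \
      --filters "Name=description,Values=Amazon EKS my-cluster" \
      --query "NetworkInterfaces[].{Id:NetworkInterfaceId,Subnet:SubnetId}"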

alfredkrohmer avatar Feb 23 '19 11:02 alfredkrohmer

@devkid yes, but that's the problem. You can schedule pods on subnets which are not configured on the control plane, but then you can't access them (logs, proxy, whatever).

Pryz avatar Mar 04 '19 18:03 Pryz

@Pryz If you have proper routing between the different subnets, this is not a problem. We have configured our control plane for one set of subnets, our workers run in a second, disjoint set of subnets, and logs, proxy, and exec are working just fine.

alfredkrohmer avatar Mar 04 '19 18:03 alfredkrohmer

@Pryz Could you explain in detail how to set up routing between the different subnets? I am just wondering how to access a disjoint set of subnets. I look forward to your reply! Thanks

noahingh avatar Apr 10 '19 01:04 noahingh

@hanjunlee I'm not sure I understand your question. My setup is quite simple: 1 VPC, up to 5 CIDRs with 6 subnets each (3 private subnets, 3 public subnets). Each AZ gets 2 routing tables (1 for private subnets and 1 for public subnets). There is no issue in this routing setup. Any IP from a subnet can talk with any other IP from any of the other subnets.

Pryz avatar Apr 10 '19 05:04 Pryz

I am seeing the same issue. I configured 2 subnets initially, but my CIDR range was too small for IP assignments from the cluster. So I added new subnets to the VPC, and the worker nodes are running fine in these new subnets.

When using kubectl proxy and accessing the URL, I get the error:

Error: 'Address is not allowed' Trying to reach: 'https://secondary-ip-of-worker-node:8443/'

The control plane ENIs are in the old subnets, the worker nodes are in the new subnets, and the ENIs, worker nodes, and kubectl host all have inbound and outbound rules for each other. I would think this is related to the new subnets not having an attached ENI for the control plane. Any help would be appreciated.

aschonefeld avatar May 03 '19 14:05 aschonefeld

@aschonefeld did you tag the new subnets with the correct kubernetes.io/cluster/<clustername> tag set to shared?

I could not reproduce the error when I tagged the subnet that way. Spun up a node in the subnet and started a pod on that node. Afterwards I could run kubectl logs|exec|port-forward on that Pod.
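For reference, tagging a subnet that way from the CLI looks roughly like this (the subnet ID and cluster name are placeholders):

    # Tag a new subnet so Kubernetes/EKS treats it as shared with this cluster
    aws ec2 create-tags \
      --resources subnet-0123456789abcdef0 \
      --tags Key=kubernetes.io/cluster/my-cluster,Value=shared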

ckassen avatar May 24 '19 15:05 ckassen

@ckassen the CloudFormation template for the worker nodes tagged the subnet as shared. kubectl logs, for example, is also working for me, but proxying to the dashboard is not. The command goes through, but no connection to the dashboard is possible. Is kubectl proxy working for you?

aschonefeld avatar May 28 '19 07:05 aschonefeld

If you decide your load balancers are in the wrong subnets and create new ones, as far as I can tell EKS doesn't detect the new subnets and still creates the load balancers in the old subnets, even though the old ones are no longer tagged with kubernetes.io/role/internal-elb. Being able to add the new subnets to EKS would be useful.

willthames avatar May 31 '19 01:05 willthames

Any work-around known so far? Terraform wants to create a new EKS cluster for me after adding new subnets :-/

thomasjungblut avatar Aug 06 '19 14:08 thomasjungblut

@thomasjungblut are you using this module? https://github.com/terraform-aws-modules/terraform-aws-eks Is it the aws_eks_cluster.this resource that wants to be replaced?

sjmiller609 avatar Aug 06 '19 21:08 sjmiller609

We built our own module, but effectively it's the same TF resource that wants to be replaced yes.

It makes sense, since the API doesn't support changing the subnets: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html

Would be cool to at least have the option of adding new subnets. The background is that we want to switch from public to private subnets, so we could add our private subnets on top of the existing public ones and just change the routes a bit. Would certainly make our life a bit easier :)

thomasjungblut avatar Aug 07 '19 06:08 thomasjungblut

We just ran into this exact issue, using that EKS TF module too. A workaround that seems to work (a rough CLI sketch of steps 2 and 3 follows below):

  1. Create the new subnets, set up the routes, etc.
  2. Manually edit the ASG for the worker nodes and add the subnets
  3. Edit the control plane SG and add the CIDR ranges of the new subnets

This, of course, breaks running TF with the EKS module for that cluster again. We're hoping to mitigate that by tightening up the TF code so we can just create new, properly sized VPC/subnets and kill the old EKS cluster entirely.

We're trying to make a custom TF module that will do the above work without using the EKS module, so at least we can apply it programmatically in the future if needed while that cluster is still around.
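Roughly, steps 2 and 3 from the CLI (a sketch; the ASG name, subnet IDs, security group ID, and CIDR are placeholders):

    # Step 2: add the new subnets to the worker node Auto Scaling group
    aws autoscaling update-auto-scaling-group \
      --auto-scaling-group-name my-eks-workers \
      --vpc-zone-identifier "subnet-0aaa,subnet-0bbb,subnet-0ccc"

    # Step 3: allow the new subnets' CIDR to reach the API server
    # through the control plane security group
    aws ec2 authorize-security-group-ingress \
      --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 443 \
      --cidr 10.1.0.0/16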

theothermike avatar Aug 08 '19 14:08 theothermike

We are having the same problem. We needed to add new subnets to the EKS cluster and had to rebuild it, since aws eks update-cluster-config --resources-vpc-config does not allow updating subnets or security groups once the cluster has been built.

An error occurred (InvalidParameterException) when calling the UpdateClusterConfig operation: subnetIds and securityGroupIds cannot be updated.
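For reference, this is the kind of call that fails (cluster name and subnet IDs are placeholders; the shorthand syntax is a sketch):

    # Attempting to change the registered subnets is rejected by the EKS API
    aws eks update-cluster-config \
      --name my-cluster \
      --resources-vpc-config subnetIds=subnet-0aaa,subnet-0bbb
    # => InvalidParameterException: subnetIds and securityGroupIds cannot be updated.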

jahzielHA avatar Aug 30 '19 08:08 jahzielHA

Do we really need to rebuild the entire cluster to add a subnet?

flythebluesky avatar Sep 19 '19 12:09 flythebluesky

Workaround: if your EKS cluster is behind the current latest version and you are planning to upgrade.

The control plane discovers subnets during its initialization process. Tag the new subnets and upgrade the cluster; the newly tagged subnets will be discovered automatically.
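A sketch of that sequence (the cluster name, subnet ID, and target version are placeholders):

    # Tag the new subnet so the control plane can discover it
    aws ec2 create-tags \
      --resources subnet-0123456789abcdef0 \
      --tags Key=kubernetes.io/cluster/my-cluster,Value=shared

    # Then upgrade; subnets are re-discovered during the upgrade
    aws eks update-cluster-version \
      --name my-cluster \
      --kubernetes-version 1.14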

mailjunze avatar Oct 22 '19 07:10 mailjunze

> Workaround: if your EKS cluster is behind the current latest version and you are planning to upgrade.
>
> The control plane discovers subnets during its initialization process. Tag the new subnets and upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround didn't work for us :( We created new subnets, tagged them with the kubernetes.io/cluster/eks: shared tag, and ran the EKS upgrade, but nothing changed in the subnets attached to the EKS cluster. Anything we missed?

henkka avatar Oct 24 '19 13:10 henkka

> Workaround: if your EKS cluster is behind the current latest version and you are planning to upgrade.
>
> The control plane discovers subnets during its initialization process. Tag the new subnets and upgrade the cluster; the newly tagged subnets will be discovered automatically.

Same as @henkka, the workaround didn't work for me either. What should I do?

qcu266 avatar Nov 11 '19 08:11 qcu266

@QCU Did you modify the control plane security groups with the new CIDRs?

BertieW avatar Nov 29 '19 09:11 BertieW

@qcu Did you modify the control plane security groups with the new CIDRs?

^-- @QCU266: this was for you.

casutherland avatar Dec 02 '19 01:12 casutherland

The process that worked for us using Terraform:

  1. Created the new subnet(s)
  2. Tagged the new subnets with the kubernetes.io/cluster/<clustername> tag set to shared (our subnets share the same route table, but if they didn't, we'd have tagged that too)
  3. Modified the security group for the control plane to add the new CIDR

The cluster schedules pods just fine, and stuff like logs|proxy|etc works with no issue.

We use the EKS Terraform module, and all of this was doable with Terraform. The worker node block will happily accept a subnet that isn't one of those declared with the cluster initially. No manual changes required.

Assuming that the assets are properly tagged, I'd venture that the kubectl issues encountered above are down to SG configuration, not inherently EKS-related.

BertieW avatar Dec 11 '19 07:12 BertieW

Is there any plan for this proposal? It was created almost a year ago and AWS doesn't seem to have any plan for it.

khacminh avatar Jan 15 '20 04:01 khacminh

@BertieW using the TF module with the new subnets added to the worker groups "forces replacement" of the EKS cluster.

All subnets (with tags) are added to the VPC and SGs, they share the same route table, and they are within the same VPC. As @thomasjungblut pointed out, the TF module appears to be restricted by the AWS API limitations and cannot add the additional subnets to the existing cluster. I don't see any way to get around replacing the EKS cluster if deploying from the TF module :(

cazter avatar Feb 13 '20 18:02 cazter

Alright, found a solution for the TF EKS module.

You can't make changes to the module's subnets parameter. So if you were passing these subnets in via a variable, as I was, like so:

subnets = module.vpc.private_subnets

You'll need to provide a static definition for precisely those subnets used when creating your cluster, example:

subnets = ["subnet-085cxxxx", "subnet-0a60xxxx", "subnet-0be8xxxx"]

Then within your worker groups you can add your new subnets. Afterwards you'll be able to apply the TF without forcing a cluster replacement.
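If you no longer know exactly which subnets the cluster was created with, something like this should recover them (the cluster name is a placeholder):

    # Print the subnet IDs currently registered with the control plane
    aws eks describe-cluster \
      --name my-cluster \
      --query "cluster.resourcesVpcConfig.subnetIds"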

cazter avatar Feb 13 '20 19:02 cazter

> Workaround: if your EKS cluster is behind the current latest version and you are planning to upgrade.
>
> The control plane discovers subnets during its initialization process. Tag the new subnets and upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround no longer works. The subnets chosen while creating the EKS cluster are used by the control plane nodes only. However, you can use different subnets for worker nodes: EKS can register worker nodes in subnets that were not part of the initial set when the cluster was created.

You may attach a new CIDR range to the VPC, carve subnets out of the newly created CIDR range, and tag them so that EKS discovers those new subnets.

Worker nodes can be launched in new CIDR range subnets in the same VPC after cluster creation without any issues.
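A sketch of that approach (the VPC ID, CIDR ranges, availability zone, subnet ID, and cluster name are placeholders):

    # Attach a secondary CIDR range to the existing VPC
    aws ec2 associate-vpc-cidr-block \
      --vpc-id vpc-0123456789abcdef0 \
      --cidr-block 100.64.0.0/16

    # Carve a subnet out of the new range
    aws ec2 create-subnet \
      --vpc-id vpc-0123456789abcdef0 \
      --cidr-block 100.64.0.0/19 \
      --availability-zone us-east-1a

    # Tag the new subnet so EKS discovers it
    aws ec2 create-tags \
      --resources subnet-0123456789abcdef0 \
      --tags Key=kubernetes.io/cluster/my-cluster,Value=shared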

mailjunze avatar Mar 05 '20 09:03 mailjunze

+1

sbonasu avatar Mar 19 '20 16:03 sbonasu

+1

chunsli avatar Mar 30 '20 08:03 chunsli