
Support multiple route tables

Open DockToFuture opened this issue 3 years ago • 4 comments

What would you like to be added:

I would like support for multiple route tables to be added.

Why is this needed:

This is needed to create an AWS Kubernetes cluster that spans multiple zones when routing is done without an overlay network. Each zone has its own route table, and when I try to set the cluster up, the CCM returns the following error:

E0719 13:12:31.260618       1 route_controller.go:119] Couldn't reconcile node routes: error listing routes: found multiple matching AWS route tables for AWS cluster: shoot--core--aws-no-ov

After checking the code, I saw that the cloud controller manager supports only one route table. It would be great if support for multiple route tables could be added. The setup works on all other cloud providers, but not on AWS because this feature is missing.
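To illustrate the limitation, here is a minimal Python sketch of a lookup that accepts exactly one route table per cluster, next to a variant that returns every match. It is not the CCM's actual code (which is written in Go). The `RouteTable` type, the helper names, the `kubernetes.io/cluster/<name>` tag key, and the "no match" message are assumptions made for illustration; only the "found multiple matching" error text is taken from the log line above.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    # Hypothetical stand-in for an AWS route table: an ID plus its tags.
    table_id: str
    tags: dict = field(default_factory=dict)


def cluster_tag_key(cluster_name):
    # Assumption: route tables are marked as belonging to a cluster with a
    # "kubernetes.io/cluster/<name>" tag.
    return f"kubernetes.io/cluster/{cluster_name}"


def matching_tables(tables, cluster_name):
    """Multi-table variant: return every route table tagged for the cluster."""
    key = cluster_tag_key(cluster_name)
    return [t for t in tables if key in t.tags]


def find_single_table(tables, cluster_name):
    """Single-table variant: more than one match is treated as an error."""
    matches = matching_tables(tables, cluster_name)
    if not matches:
        # Illustrative message, not taken from the CCM.
        raise LookupError(f"no route table found for cluster: {cluster_name}")
    if len(matches) > 1:
        raise LookupError(
            "found multiple matching AWS route tables for AWS cluster: "
            + cluster_name
        )
    return matches[0]
```

With one tagged route table per zone, `find_single_table` fails in the way the log shows, while `matching_tables` returns the full set that a multi-table route controller would need to reconcile.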

/kind feature

DockToFuture avatar Jul 19 '22 13:07 DockToFuture

@DockToFuture: This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jul 19 '22 13:07 k8s-ci-robot

Sounds reasonable at first glance, but could I get you to provide more information about your desired set-up so I can play around and understand the limitation? What CNI are you using? What are you using to set up the cluster?

nckturner avatar Aug 16 '22 17:08 nckturner

We (@DockToFuture and I) are using Gardener to create the Kubernetes clusters. It uses the Gardener extension provider AWS to set up the infrastructure. You can find the Terraform template here. Essentially, there is one cluster-global route table per cluster, which includes the Kubernetes node CIDR as local and the internet gateway as the default route. In addition, each availability zone has its own route table, which includes the Kubernetes node CIDR as local and a NAT gateway as the default gateway. We use Calico or Cilium as CNIs, with Calico being used in the majority of scenarios. As of now, we use overlay networks (IP-in-IP for Calico and VXLAN for Cilium), but would like to get rid of them. Therefore, our goal is to use the cloud controller manager to create the corresponding routes for the per-node pod CIDRs in the infrastructure. This works on all other infrastructures we work with, e.g. Azure, GCP and OpenStack.

ScheererJ avatar Aug 22 '22 07:08 ScheererJ

@DockToFuture pinging on this. Any way we can get a networking diagram showing what Gardener requires the underlying connectivity to look like?

jaypipes avatar Oct 14 '22 16:10 jaypipes

Hello @jaypipes, you can see an example diagram for one of the multi-zone clusters we create in the following picture.

[Image: network diagram of a multi-zone Gardener cluster]

Compared to a standard EKS cluster, a Gardener cluster has the following setup:

  • 3 subnets per zone. These are the equivalent of the public and private subnets (called nodes_ in Gardener) of an EKS cluster, plus one more for internal load-balancing purposes (called private_ in Gardener).
  • One route table (named main) for all the public subnets. Public subnets have a NAT GW and a route to the IGW.
  • Each zone's nodes subnet gets its own route table with a route to that zone's NAT GW. The setup so far is very similar to what we see EKS using.

The main differences are:

  1. We are not confined to the AWS CNI, but also support Calico and Cilium, as @ScheererJ mentioned.
  2. The pod CIDR ranges we use are not necessarily part of the VPC range.

Because of these differences, we cannot rely on native VPC routing (e.g. the secondary IPs for instances used by EKS). Instead, we need each of the route tables used by the nodes subnets to be extended with routes that forward traffic targeting specific pod IP ranges to their assigned node. From what we can see, any setup that aims to provide multi-zone support in AWS needs to use multiple subnets, and by extension these subnets will need their own route tables. This is why we think it is a good idea to extend the existing capability of the AWS CCM to work properly for these multi-zone setups.
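As a rough sketch of what is being asked for, the Python snippet below computes, for each route table, the pod CIDR routes that are still missing. It assumes every nodes route table should carry one route per node, mapping that node's pod CIDR to the node's instance. The `Node` type, the `missing_routes` helper, and all IDs and CIDRs are hypothetical, and nothing here calls the AWS API.

```python
import ipaddress
from typing import NamedTuple


class Node(NamedTuple):
    # Hypothetical view of a cluster node: its EC2 instance and pod CIDR.
    instance_id: str
    pod_cidr: str


def desired_routes(nodes):
    """Map each node's pod CIDR to the instance that should receive it."""
    return {str(ipaddress.ip_network(n.pod_cidr)): n.instance_id for n in nodes}


def missing_routes(existing_by_table, nodes):
    """For each route table, list pod routes that are absent or mis-targeted.

    existing_by_table maps a route table ID to {destination CIDR: target}.
    Routes not related to pod CIDRs (local, default) are left untouched.
    """
    want = desired_routes(nodes)
    plan = {}
    for table_id, have in existing_by_table.items():
        plan[table_id] = {
            cidr: target
            for cidr, target in want.items()
            if have.get(cidr) != target
        }
    return plan
```

For example, with two nodes and two per-zone route tables, a table that already routes the first node's pod CIDR only needs the second node's route, while an empty table needs both. Applying the same desired set to every table is what lets traffic cross zones without an overlay.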

kon-angelo avatar Oct 26 '22 09:10 kon-angelo

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 24 '23 10:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 23 '23 10:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 25 '23 11:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

k8s-ci-robot avatar Mar 25 '23 11:03 k8s-ci-robot