cluster-api-provider-aws

Fully private CAPA clusters

Open Skarlso opened this issue 3 years ago • 8 comments

/kind feature

Describe the solution you'd like:

This is a catch-all issue for everything that needs to be done related to making private clusters a reality.

Anything else you would like to add:

Environment:

  • Cluster-api-provider-aws version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Skarlso avatar Sep 06 '22 07:09 Skarlso

We have an issue where we wanted to re-think the networking and offer different topologies: #1484

A few other issues that are related:

  • #3700
  • #2465
  • #2484
  • #2849
  • #3131

richardcase avatar Sep 06 '22 08:09 richardcase

/triage accepted /priority important-soon

richardcase avatar Sep 06 '22 08:09 richardcase

@Skarlso @richardcase would I be correct that this relates to both managed and unmanaged VPCs?

If so, I know there's some current logic that requires a CAPA-managed VPC to have at least 1 public and 1 private subnet. I'm not sure what the context is there, though. Is there a limitation elsewhere that requires this to be the case? (I'm guessing it's related to the management cluster being able to connect to the workload cluster, but I'm not sure that's still valid.)

Is it worth opening an issue to cover changing this requirement to fit in with the issue?

Edit: the getDefaultSubnets() function also forces the creation of a public subnet in each AZ
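
To make the behaviour concrete, here is a rough sketch of the defaulting and validation being discussed. This is illustrative only, not CAPA's actual source; the type and function names are paraphrased from the behaviour described above.

```go
package network

import "fmt"

// Subnet is a trimmed-down stand-in for CAPA's subnet spec, for illustration only.
type Subnet struct {
	AvailabilityZone string
	IsPublic         bool
	CIDR             string
}

// defaultSubnets mirrors the behaviour described above for getDefaultSubnets():
// for every AZ it emits a public subnet alongside a private one.
func defaultSubnets(azs []string, cidrFor func(az string, public bool) string) []Subnet {
	var subnets []Subnet
	for _, az := range azs {
		subnets = append(subnets,
			Subnet{AvailabilityZone: az, IsPublic: true, CIDR: cidrFor(az, true)},
			Subnet{AvailabilityZone: az, IsPublic: false, CIDR: cidrFor(az, false)},
		)
	}
	return subnets
}

// validateSubnets is the kind of hardcoded check a fully private cluster
// trips over: at least one public and one private subnet is required.
func validateSubnets(subnets []Subnet) error {
	var public, private int
	for _, s := range subnets {
		if s.IsPublic {
			public++
		} else {
			private++
		}
	}
	if public == 0 || private == 0 {
		return fmt.Errorf("managed VPC needs at least 1 public and 1 private subnet, got %d public and %d private", public, private)
	}
	return nil
}
```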

AverageMarcus avatar Sep 09 '22 12:09 AverageMarcus

@Skarlso @richardcase would I be correct that this relates to both managed and unmanaged VPCs?

Correct.

If so, I know there's some current logic that requires a CAPA-managed VPC to have at least 1 public and 1 private subnet. I'm not sure what the context is there, though. Is there a limitation elsewhere that requires this to be the case? (I'm guessing it's related to the management cluster being able to connect to the workload cluster, but I'm not sure that's still valid.)

Precisely. CAPA needs to maintain a connection and refresh the kubeconfig for access. But this could be done via a private connection as well, as long as CAPA is in the same network, I guess.

Also, when CAPA is done creating the cluster (managed EKS), it needs to shut off the cluster's public endpoint. I have no idea how that will work, but as long as CAPA is in the same network, it should be okay.

The requirement for a public subnet must be removed.
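
For reference, shutting off the public endpoint on an EKS cluster boils down to an UpdateClusterConfig call. Below is a minimal sketch with the AWS SDK for Go v1; the cluster name is a placeholder and this is not CAPA's actual code path.

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/eks"
)

func main() {
	sess := session.Must(session.NewSession())
	client := eks.New(sess)

	// Flip the EKS API server endpoint to private-only access.
	// "my-cluster" is a placeholder name for this example.
	_, err := client.UpdateClusterConfig(&eks.UpdateClusterConfigInput{
		Name: aws.String("my-cluster"),
		ResourcesVpcConfig: &eks.VpcConfigRequest{
			EndpointPrivateAccess: aws.Bool(true),
			EndpointPublicAccess:  aws.Bool(false),
		},
	})
	if err != nil {
		fmt.Println("update failed:", err)
	}
}
```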

Is it worth opening an issue to cover changing this requirement to fit in with the issue?

Sorry, I don't understand this sentence. Would you mind rephrasing it?

Skarlso avatar Sep 10 '22 06:09 Skarlso

Sorry, I don't understand this sentence. Would you mind rephrasing it?

Sorry, it'd been a long week 😅

What I mean is: should we have an issue that specifically covers removing the hardcoded requirement of having both public and private subnets? I imagine there's more needed than just removing the checks, so it would be good to collect that info somewhere.

Also, when CAPA is done creating the cluster (managed EKS), it needs to shut off the cluster's public endpoint. I have no idea how that will work, but as long as CAPA is in the same network, it should be okay.

This has reminded me: there also needs to be support for situations where the initial creation of public resources (subnets, NAT gateway) can be skipped entirely. Creating them during initial setup and then later removing them isn't going to work in strict environments where policies are enforced via SCPs. (We already have customers with this requirement.)

AverageMarcus avatar Sep 10 '22 10:09 AverageMarcus

What I mean is: should we have an issue that specifically covers removing the hardcoded requirement of having both public and private subnets? I imagine there's more needed than just removing the checks, so it would be good to collect that info somewhere.

Ah gotcha. :D Yes, sure, we can track that separately. This ticket is meant to be something akin to an epic basically.

This has reminded me: there also needs to be support for situations where the initial creation of public resources (subnets, NAT gateway) can be skipped entirely. Creating them during initial setup and then later removing them isn't going to work in strict environments where policies are enforced via SCPs. (We already have customers with this requirement.)

True. Sadly. :D That will be a lot of work. :D But, hmm... maybe not, because NAT gateways are only created when using public subnets, so if we don't create those at all, we shouldn't create a NAT gateway either, right?
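
Something like the following guard would capture that idea. It is only a sketch, reusing the illustrative Subnet type from the earlier snippet, and is not CAPA's actual reconciler logic.

```go
// natGatewaySubnets returns the public subnets that would host NAT gateways.
// Illustrative only: reuses the Subnet type from the earlier sketch.
func natGatewaySubnets(subnets []Subnet) []Subnet {
	var public []Subnet
	for _, s := range subnets {
		if s.IsPublic {
			public = append(public, s)
		}
	}
	// No public subnets means there is nothing to host a NAT gateway in,
	// so a fully private topology yields no NAT gateway work at all.
	return public
}
```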

Skarlso avatar Sep 10 '22 12:09 Skarlso

True. Sadly. :D That will be a lot of work. :D But, hmm... maybe not, because NAT gateways are only created when using public subnets, so if we don't create those at all, we shouldn't create a NAT gateway either, right?

In theory, yes.

I'll raise an issue for making the NAT gateway an optional resource for private network clusters and link back to here.

AverageMarcus avatar Sep 12 '22 06:09 AverageMarcus

What I mean is: should we have an issue that specifically covers removing the hardcoded requirement of having both public and private subnets? I imagine there's more needed than just removing the checks, so it would be good to collect that info somewhere.

Our long-standing open issue (#1484) to re-think how we represent different network topologies could help with this.

richardcase avatar Oct 10 '22 15:10 richardcase

Also related to this: https://github.com/kubernetes-sigs/cluster-api/issues/6520

richardcase avatar Nov 02 '22 20:11 richardcase

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 31 '23 21:01 k8s-triage-robot

/remove-lifecycle stale

Skarlso avatar Feb 01 '23 08:02 Skarlso

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged. Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Deprioritize it with /priority important-longterm or /priority backlog
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar May 02 '23 08:05 k8s-triage-robot

/remove-lifecycle stale

Skarlso avatar May 02 '23 09:05 Skarlso

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 19 '24 11:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 18 '24 12:02 k8s-triage-robot

/help /priority important-soon

richardcase avatar Jul 18 '24 10:07 richardcase

@richardcase: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help /priority important-soon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Jul 18 '24 10:07 k8s-ci-robot