
Support cross account NodePools

Open RobCannon opened this issue 7 months ago • 4 comments

Description

What problem are you trying to solve? I would like to support running nodes in another account so the compute can run in the same VPC as other resources in that account.

How important is this feature to you? I know that we can use VPC peering to reach across accounts, but it would be more efficient if the nodes were running in the same VPC. That also allows us to easily allocate costs for an application or sub-system to a single AWS account.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

RobCannon avatar Apr 09 '25 14:04 RobCannon

It sounds like you're aiming to manage multiple Kubernetes clusters from a central control plane. This concept is similar to the functionality offered by Cluster API (CAPI). To provide more tailored advice:

  • Could you elaborate on your current infrastructure setup and requirements?
  • What specific challenges are you trying to address with multi-cluster management?
  • Are there any particular features or capabilities you're looking for in a centralized management solution?

engedaam avatar Apr 09 '25 19:04 engedaam

/priority awaiting-more-evidence
/triage needs-information

engedaam avatar Apr 09 '25 19:04 engedaam

We currently have a cluster in one account and use Karpenter to manage the nodes in that account. We support multiple applications in the cluster but we would like to have better partitioning for the resources for each application. We do have very specific IAM roles that are bound to each application that only let them access specific resources. But, ultimately, all of the resources for all applications are intermingled in one account.

One option is to create a cluster for each application. That increases costs since we have to duplicate core Kubernetes resources like ArgoCD, Crossplane and Kyverno. The basic cost overhead to run a cluster with no applications is around $3K/month. And my team now must manage more clusters. I am looking for an alternative.

We are starting to experiment with using a different account for each application and creating the AWS resources (like RDS, DynamoDB, S3, CloudFront) in the application-specific account. This will require us to use VPC peering to connect the cluster account to all the application accounts. But this doesn't really partition applications, since ALL nodes in the cluster will require access to all VPCs. I am not even sure if this will work, as we are just starting to experiment with it.

I think I can create NodeClasses in Karpenter that are bound to specific subnets/security groups in the cluster account. So, an application is bound to a NodePool/NodeClass combo that gives it access to a VPC peering connection, through which it can reach the resources in the application account. That would create better partitioning, but I am slightly concerned that I am introducing a network hop.
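
A minimal sketch of the NodePool/NodeClass pairing described above, assuming the Karpenter v1 APIs (`karpenter.sh/v1` NodePool and `karpenter.k8s.aws/v1` EC2NodeClass). The application name `billing`, the `app` tag key, the IAM role name, and the AMI alias are all hypothetical placeholders, not values from this thread. The script only prints manifests as JSON, which `kubectl apply -f -` accepts; the selected subnets and security groups still live in the cluster account, so this does not address the cross-account request itself.

```python
import json


def node_class(app: str, role: str) -> dict:
    """EC2NodeClass that picks subnets and security groups by a per-application tag."""
    # Hypothetical convention: subnets and security groups are tagged app=<name>.
    selector = [{"tags": {"app": app}}]
    return {
        "apiVersion": "karpenter.k8s.aws/v1",
        "kind": "EC2NodeClass",
        "metadata": {"name": f"{app}-nodes"},
        "spec": {
            "role": role,  # node IAM role name (placeholder)
            "amiSelectorTerms": [{"alias": "al2023@latest"}],  # assumed AMI family
            "subnetSelectorTerms": selector,
            "securityGroupSelectorTerms": selector,
        },
    }


def node_pool(app: str) -> dict:
    """NodePool that references the application's EC2NodeClass and taints its nodes."""
    return {
        "apiVersion": "karpenter.sh/v1",
        "kind": "NodePool",
        "metadata": {"name": f"{app}-pool"},
        "spec": {
            "template": {
                "spec": {
                    "nodeClassRef": {
                        "group": "karpenter.k8s.aws",
                        "kind": "EC2NodeClass",
                        "name": f"{app}-nodes",
                    },
                    # Only pods that tolerate this taint land on the application's nodes.
                    "taints": [{"key": "app", "value": app, "effect": "NoSchedule"}],
                    "requirements": [
                        {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]}
                    ],
                }
            }
        },
    }


manifests = [node_class("billing", "KarpenterNodeRole-billing"), node_pool("billing")]
# Wrap both objects in a v1 List so they can be applied in one kubectl call.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The application's pods would then need a matching toleration and a node selector (or affinity) for that NodePool so that they schedule only onto nodes in the tagged subnets.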

A better version of this solution would be to have the EC2 instances run in the application account. This gets rid of the network hop between application pods and RDS. It also has the benefit that I can easily get the costs of running an application by just getting the costs for that application account. The only compute running in the cluster account would be to support the shared services.

There are plenty of issues to solve to make this possible, but it seems like a good goal. I would love to hear feedback.

RobCannon avatar Apr 10 '25 14:04 RobCannon

@engedaam Do you need more information?

RobCannon avatar Apr 30 '25 13:04 RobCannon

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 29 '25 14:07 k8s-triage-robot

I have the same request, for the same reasons as @RobCannon. But instead of a distinct VPC for each account, we aim to work with shared VPCs.

jkroepke avatar Jul 31 '25 10:07 jkroepke

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 30 '25 10:08 k8s-triage-robot

/remove-lifecycle rotten

jkroepke avatar Aug 30 '25 11:08 jkroepke