aws-load-balancer-controller

add sagemaker-hyperpod compute type to resolve its pods via VPC ENI

Open amber-liu-amzn opened this issue 1 year ago • 9 comments

Issue

SageMaker HyperPod offers service-managed Kubernetes nodes that are accessible from customer accounts. Using aws-load-balancer-controller in HyperPod EKS clusters is not supported today because the nodes are in the SageMaker VPC while the load balancers are in the customer VPC.

Why does routing traffic directly to a HyperPod pod's IP not work today? SageMaker HyperPod pods are in a different VPC, but LBC incorrectly treats them as EC2 pods in the customer VPC, which leads to incorrect ENI info retrieval and missing security group permissions.

Description

To enable IP target mode for pods running on SageMaker HyperPod, this PR adds sagemaker-hyperpod as a new compute type so that its pods are resolved via their VPC ENI.
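
For readers unfamiliar with the idea of a per-node compute type, the following is a minimal Go sketch of what such a classification could look like. It is not the code in this PR. The `ComputeType` type, its constants, and `classifyComputeType` are hypothetical names, and the assumption that a HyperPod node's `spec.providerID` contains a `/sagemaker/` path segment is an illustration only. The HyperPod provider ID in the example is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// ComputeType names where a pod's node runs.
// Hypothetical type and values, not the controller's actual constants.
type ComputeType string

const (
	ComputeTypeEC2               ComputeType = "ec2"
	ComputeTypeSageMakerHyperPod ComputeType = "sagemaker-hyperpod"
)

// classifyComputeType picks a compute type from a node's spec.providerID.
// ASSUMPTION: HyperPod provider IDs contain a "/sagemaker/" path segment.
// Anything else is treated as a plain EC2 node in the customer VPC.
func classifyComputeType(providerID string) ComputeType {
	if strings.Contains(providerID, "/sagemaker/") {
		return ComputeTypeSageMakerHyperPod
	}
	return ComputeTypeEC2
}

func main() {
	// A standard EC2-backed node provider ID.
	fmt.Println(classifyComputeType("aws:///us-west-2a/i-0123456789abcdef0"))
	// A made-up HyperPod-style provider ID, for illustration only.
	fmt.Println(classifyComputeType("aws:///usw2-az2/sagemaker/cluster/hyperpod-example-i-0123456789abcdef0"))
}
```

Once a pod is tagged with its own compute type, ENI lookup can take a separate path for HyperPod pods instead of treating them as EC2 pods in the customer VPC.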

Checklist

  • [x] Added tests that cover your change (if possible)
  • [ ] Added/modified documentation as required (such as the README.md, or the docs directory)
  • [x] Manually tested
  • [x] Made sure the title of the PR is a good description that can go into the release notes

BONUS POINTS checklist: complete for good vibes and maybe prizes?! :exploding_head:

  • [ ] Backfilled missing tests for code in same general area :tada:
  • [ ] Refactored something and made the world a better place :star2:

amber-liu-amzn, Oct 10 '24 19:10