
Autoselect Zones Request

pluttrell opened this issue 9 years ago · 5 comments

One HA best practice when deploying on AWS is to run clusters across multiple Availability Zones. kops currently supports this by letting users pass the AZs they want to use via the --zones and --master-zones args. This requires users to determine the AZs themselves, which slows down installation and leaves room for user error or confusion. We should definitely keep this mechanism in place, but I propose that we also add another mechanism where kops auto-selects which zones to use.

Here's one idea for implementation:

Add these two args: --region and --zone-span.

  • If --zone-span isn't present, default it to 3.
  • If --zones and --master-zones are both absent and --region is present, then automatically determine which zones to use.
  • So --region us-east-1 --zone-span 3 would effectively yield something like the following under the covers: --zones=us-east-1b,us-east-1c,us-east-1d --master-zones=us-east-1b,us-east-1c,us-east-1d.
  • Not all AWS users have access to all regions, so you'll need to query the AWS API. One way is to use the CLI: aws ec2 describe-availability-zones, then parse the JSON, e.g. with jq (jq '.AvailabilityZones | .[] | .ZoneName' --raw-output).
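The steps above could be sketched roughly as follows. Note that --region and --zone-span are the flag names proposed here, not existing kops flags, and the zone list would come from the describe-availability-zones query mentioned above; this is a sketch of the selection logic only:

```shell
# Proposed auto-selection sketch. In practice the candidate zone list
# would come from something like:
#   aws ec2 describe-availability-zones --region "$REGION" \
#     | jq '.AvailabilityZones | .[] | .ZoneName' --raw-output
#
# pick_zones: read a newline-separated zone list on stdin, take the
# first SPAN zones, and emit them comma-joined, ready to pass to
# --zones / --master-zones.
pick_zones() {
  span="${1:-3}"            # default the proposed --zone-span to 3
  head -n "$span" | paste -s -d, -
}

# Example (zones piped in rather than queried, for illustration):
#   printf 'us-east-1b\nus-east-1c\nus-east-1d\n' | pick_zones 3
```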

pluttrell avatar Feb 23 '17 20:02 pluttrell

It is a good idea. The challenge is that not all zones have all functionality. I don't know how widespread this is, but my us-east-1a can't create subnets, for example. AFAIK there is no way to know this other than to try to create a subnet and have it fail. Here is a good discussion of the issue: https://github.com/deis/deis/issues/1601
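The try-it-and-see check described above could look something like this. probe_zone here is a placeholder, not anything kops does today; a real probe might attempt to create (and immediately delete) a throwaway subnet in the zone, where the VPC and CIDR arguments would have to be supplied by the caller:

```shell
# Placeholder probe: succeed if the zone can host a subnet. A real
# version might run (VPC_ID / CIDR are assumptions, as is the cleanup):
#   aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block "$CIDR" \
#     --availability-zone "$1"
# ...and delete the subnet again if creation succeeded.
probe_zone() {
  echo "probe_zone not implemented; override before use" >&2
  return 1
}

# Filter a candidate zone list down to the zones that pass the probe.
usable_zones() {
  for z in "$@"; do
    probe_zone "$z" && echo "$z"
  done
}
```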

justinsb avatar Feb 24 '17 06:02 justinsb

Interesting challenge. Since kops will fail to build a working k8s environment with an incompatible zone whether it was specified manually or determined automatically, I think that problem is separate and should be addressed separately, perhaps even first. I just created a separate issue #1996 to discuss that.

Also it's interesting that your AWS account can't create subnets in us-east-1a. Mine can. Here's one I created just now to double check:

$ kops validate cluster test.my-domain.com
Validating cluster test.my-domain.com

INSTANCE GROUPS
NAME			ROLE	MACHINETYPE	MIN	MAX	SUBNETS
master-us-east-1a	Master	t2.large	1	1	us-east-1a
master-us-east-1b	Master	t2.large	1	1	us-east-1b
master-us-east-1c	Master	t2.large	1	1	us-east-1c
nodes			Node	t2.large	3	3	us-east-1a,us-east-1b,us-east-1c

NODE STATUS
NAME				ROLE	READY
ip-172-20-105-233.ec2.internal	master	True
ip-172-20-121-107.ec2.internal	node	True
ip-172-20-37-147.ec2.internal	master	True
ip-172-20-47-184.ec2.internal	node	True
ip-172-20-68-20.ec2.internal	node	True
ip-172-20-88-30.ec2.internal	master	True

Your cluster test.my-domain.com is ready

Perhaps this is just an example of the differences between what different AWS accounts can do or have access to.

pluttrell avatar Feb 24 '17 20:02 pluttrell

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale

fejta-bot avatar Dec 21 '17 12:12 fejta-bot

/remove-lifecycle stale /lifecycle frozen

pluttrell avatar Jan 15 '18 07:01 pluttrell

> Perhaps this is just an example of the differences between what different AWS accounts can do or have access to.

AWS Availability Zone identifiers are randomized per account: given two AWS accounts, the us-east-1a Availability Zone in each is not guaranteed to be the same physical location.

@justinsb I'd be curious to know when the AWS account you're using was created. Rather than iterating AZs and creating a subnet in each, account creation date may be an easier way to determine if a kops cluster can be provisioned in an account.
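Worth noting: AWS also exposes a stable ZoneId (e.g. use1-az2) alongside the per-account ZoneName in describe-availability-zones, which is the reliable way to compare zones across accounts. A small sketch of looking up the stable ID for an account-local name, assuming "ZoneName ZoneId" pairs as `--output text` would produce (the sample data in the usage comment is illustrative, not from a real account):

```shell
# A real query for the name/ID pairs would be:
#   aws ec2 describe-availability-zones \
#     --query 'AvailabilityZones[].[ZoneName,ZoneId]' --output text
#
# zone_id_of: given a zone name ($1) and "name id" lines on stdin,
# print the stable zone ID for that name.
zone_id_of() {
  awk -v z="$1" '$1 == z { print $2 }'
}

# Example with made-up sample output:
#   printf 'us-east-1a use1-az2\nus-east-1b use1-az4\n' \
#     | zone_id_of us-east-1a
```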

thomasv314 avatar Jan 15 '18 17:01 thomasv314

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 31 '22 18:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Nov 30 '22 19:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Dec 30 '22 20:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 30 '22 20:12 k8s-ci-robot