add support for spot instances in AKS
What type of PR is this?
/kind feature
What this PR does / why we need it:
- Adds support for spot instances node pool for AKS.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes part of #1701
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
- [ ] squashed commits
- [ ] includes documentation
- [ ] adds unit tests
Release note:
add support for spot node pool in AKS
Hi @zmalik. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
The committers are authorized under a signed CLA.
- :white_check_mark: Zain Malik (388fa2c4928b725fa2f802c009175fd9cadce3af)
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
:memo: Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.
It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
- If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
- If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
- If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
- Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest
@CecileRobertMichon: GitHub didn't allow me to request PR reviews from the following users: meixingdb.
Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.
In response to this:
/lgtm /cc @meixingdb
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: To complete the pull request process, please ask for approval from cecilerobertmichon after the PR has been reviewed.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
rebased and squashed
/area managedclusters
@zmalik can you please check why test is failing on this one when you get a chance?
The committers are authorized under a signed CLA.
- :white_check_mark: Zain Malik (388fa2c4928b725fa2f802c009175fd9cadce3af, 757870fc2cfdf81e45222e1a75e518d12d07f1da)
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
:memo: Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.
It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
- If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
- If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
- If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
- Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
rebased to bring all fixes from main
/retest
will rebase after https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/1927 lands as it will require another rebase after that
will pick this PR now, that taints and label PRs have been merged
@zmalik is on 🔥
failed with No AKS versions found for v1.19
will try to bump in this PR just to test the fix
that fixed it
New changes are detected. LGTM label has been removed.
removed the AKS version bump as https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2108 merged
added to e2e tests a spot agentpool
/test pull-cluster-api-provider-azure-e2e-windows-dockershim
/test pull-cluster-api-provider-azure-e2e
@zmalik: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-cluster-api-provider-azure-e2e-windows-dockershim | ef1563786fd90e4c5fa15fbfcf7178166eea3680 | link | true | /test pull-cluster-api-provider-azure-e2e-windows-dockershim |
| pull-cluster-api-provider-azure-ci-entrypoint | ef1563786fd90e4c5fa15fbfcf7178166eea3680 | link | false | /test pull-cluster-api-provider-azure-ci-entrypoint |
| pull-cluster-api-provider-azure-e2e | ef1563786fd90e4c5fa15fbfcf7178166eea3680 | link | true | /test pull-cluster-api-provider-azure-e2e |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
after doing more experiments looks like AKS returns errors when setting from empty scalesetpriority to Regular
Changing property 'properties.ScaleSetPriority' is not allowed
This is an issue for all existing nodepools that were created through old versions of CAPZ. which means we will need to make it by default empty also.
@zmalik can we use *string to our advantage anywhere to be able to differentiate between empty string and actually non-existent?
@zmalik can we use *string to our advantage anywhere to be able to differentiate between empty string and actually non-existent?
yes, we can do that. But can that be confusing for users?
if it's nil --> we don't set any default value? (as that would be case for users upgrading from old version of CAPZ)
if it's not nil but empty --> we set a default value of Regular as by design in CAPZ we don't allow empty string for scalesetpriority?
was that the proposal or reverse?
@zmalik Yeah that's my thinking, though I'm just driving by and haven't looked at all the codepaths yet!
The reason to think hard about figuring out some solution of setting a capz default value is to avoid unexpected changes in the future when AKS decides to change its opinion of what should be default.
yeah, I agree.
Mostly I'm thinking in case we introduce some default and user has to set empty string to trigger the default it might not be intuitive. As with nil value the old and new will still bypass any defaulting.
Can we ask AKS team if changing from "" to Regular in Azure API should be allowed as effectively "" means Regular and the error message Changing property 'properties.ScaleSetPriority' is not allowed is purely an API error as user is not changing the instance type but setting the ScaleSetPriority which it already has.