cluster-api-provider-azure
cluster-api-provider-azure copied to clipboard
add e2e for csi migration upgrade scenarios
Signed-off-by: Ashutosh Kumar [email protected]
What type of PR is this? e2e tests
/kind feature
What this PR does / why we need it: With Kubernetes 1.23, CSI migration for azure disk is enabled by default and is in beta in 1.23 ( GA in 1.24 ). This means that one can either opt to go with the default migration in 1.23 or not go with it by explicitly disabling it. Thus we can have following upgrade paths for 1.22 to 1.23. This PR validates the upgrade paths by adding e2e.
So an upgrade from, let us say from k8s v1.22 to v1.23 can mean any of the following combination:
| S.No | AzureDiskCSI | CloudProviderAzure | AzureDiskCSIMigration |
|---|---|---|---|
| 1 | In-tree | In-tree | Disabled |
| 2 | External | In-tree | Enabled |
| 3 | External | External | Enabled |
For the last combination in the table mentioned ( i.e 3 ) external-cloud-volume-plugin: azure flag needs to be set on controller manager for old Kubelets to keep talking to in-tree CCM and upgrade fails without this is set.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2242
Special notes for your reviewer: A next PR will be raised to document the flags to be used for different scenarios in case of upgrade from v1.22 to v.123 Adding reference of a similar PR in capa https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3267/files TODOs:
- [x ] squashed commits
- [ ] includes documentation
- [ ] adds unit tests
None
/retest
/test pull-cluster-api-provider-azure-e2e-optional
/test pull-cluster-api-provider-azure-e2e-optional
/retest
/test pull-cluster-api-provider-azure-e2e-optional
/test pull-cluster-api-provider-azure-e2e-optional
The test infra PR: https://github.com/kubernetes/test-infra/pull/26993
/test pull-cluster-api-provider-azure-e2e-csi-migration
@CecileRobertMichon -- I have added [k8s-upgrade] label and skipping the test if k8s version upgrade is not from 1.22 --> 1.23.
Question: Will this test even run in testgrid as I can see that upgrade from and upgrade to is pivoted to a particular stable release of k8s. https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/test/e2e/config/azure-dev.yaml#L150
@sonasingh46 it will run in the periodic jobs (https://testgrid.k8s.io/sig-cluster-lifecycle-cluster-api-provider-azure#capi-periodic-upgrade-main-1-22-1-23)
/test ls
@CecileRobertMichon: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:
/test pull-cluster-api-provider-azure-build/test pull-cluster-api-provider-azure-ci-entrypoint/test pull-cluster-api-provider-azure-e2e/test pull-cluster-api-provider-azure-test/test pull-cluster-api-provider-azure-verify
The following commands are available to trigger optional jobs:
/test pull-cluster-api-provider-azure-apidiff/test pull-cluster-api-provider-azure-apiversion-upgrade/test pull-cluster-api-provider-azure-capi-e2e/test pull-cluster-api-provider-azure-conformance/test pull-cluster-api-provider-azure-conformance-with-ci-artifacts/test pull-cluster-api-provider-azure-coverage/test pull-cluster-api-provider-azure-e2e-csi-migration/test pull-cluster-api-provider-azure-e2e-exp/test pull-cluster-api-provider-azure-e2e-optional/test pull-cluster-api-provider-azure-e2e-workload-upgrade/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts-serial-slow
Use /test all to run the following jobs that were automatically triggered:
pull-cluster-api-provider-azure-apidiffpull-cluster-api-provider-azure-buildpull-cluster-api-provider-azure-ci-entrypointpull-cluster-api-provider-azure-coveragepull-cluster-api-provider-azure-e2epull-cluster-api-provider-azure-testpull-cluster-api-provider-azure-verify
In response to this:
/test ls
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test pull-cluster-api-provider-azure-e2e-workload-upgrade
/test pull-cluster-api-provider-azure-e2e-csi-migration
AKS has enabled AzureDiskCSIMigration and installed Azure Disk CSI driver by default from 1.21, shall we follow same way on capz?
@andyzhangx -- Can you elaborate more? What do you mean by follow same way in capz? Are you saying that all the workload cluster creation e2e test should install external azure disk csi driver? Or Are you saying that we should test upgrade from 1.20 --> 1.21 also as azure disk csi driver is enabled by default in AKS
AzureDiskCSIMigration
@sonasingh46 that depends on whether AzureDiskCSIMigration would be enabled on capz, if it's enabled on 1.23 (using default value), then Azure Disk CSI driver should be installed on capz 1.23 by default, otherwise there would be failure if user upgrade from 1.22 to 1.23 with in-tree disk PV.
Right. As far as capz is concerned, the external drivers are not installed by default and no flags are set by default on controller manager that expects external csi driver for non managed clusters.
But with 1.23 the migration flag is enabled by default on core k8s.
So I don't think we need the test for version prior to 1.22 --> 1.23
If we need to do something similar for AKS as it might have implicitly enabled the migration flag -- sure we can have a test for managed cluster ( i.e AKS ). But I do not what is that inflection version to which we should test. If I read it right -- you are saying for 1.21 k8s version in AKS -- the migration is made default-- so in this case 1.20 --> 1.21 upgrade for AKS make sense.
@sonasingh46 for capz, only 1.22 --> 1.23 upgrade scenario is needed. And actually you could run the in-tree azure disk e2e test in every capz version, those in-tree azure disk e2e tests should pass in any version no matter it has enabled AzureDiskCSIMigration flag is enabled or not, no matter it's 1.21, 1.22 or 1.23.
@andyzhangx -- Yes that makes sense. I can do that in a next PR as looks like this is bit out of scope of the current PR. wdyt?
This PR intends to test for a special inflection version.
/test pull-cluster-api-provider-azure-e2e-workload-upgrade
/test pull-cluster-api-provider-azure-e2e-csi-migration
@andyzhangx -- Yes that makes sense. I can do that in a next PR as looks like this is bit out of scope of the current PR. wdyt?
@sonasingh46 lgtm, thanks!
https://github.com/kubernetes/test-infra/pull/27261
/test pull-cluster-api-provider-azure-e2e-csi-migration
/retest pull-cluster-api-provider-azure-e2e-csi-migration
@sonasingh46: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:
/test pull-cluster-api-provider-azure-build/test pull-cluster-api-provider-azure-ci-entrypoint/test pull-cluster-api-provider-azure-e2e/test pull-cluster-api-provider-azure-test/test pull-cluster-api-provider-azure-verify
The following commands are available to trigger optional jobs:
/test pull-cluster-api-provider-azure-apidiff/test pull-cluster-api-provider-azure-apiversion-upgrade/test pull-cluster-api-provider-azure-capi-e2e/test pull-cluster-api-provider-azure-conformance/test pull-cluster-api-provider-azure-conformance-with-ci-artifacts/test pull-cluster-api-provider-azure-coverage/test pull-cluster-api-provider-azure-e2e-csi-migration/test pull-cluster-api-provider-azure-e2e-exp/test pull-cluster-api-provider-azure-e2e-optional/test pull-cluster-api-provider-azure-e2e-workload-upgrade/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts-serial-slow
Use /test all to run the following jobs that were automatically triggered:
pull-cluster-api-provider-azure-apidiffpull-cluster-api-provider-azure-buildpull-cluster-api-provider-azure-ci-entrypointpull-cluster-api-provider-azure-coveragepull-cluster-api-provider-azure-e2epull-cluster-api-provider-azure-testpull-cluster-api-provider-azure-verify
In response to this:
/retest pull-cluster-api-provider-azure-e2e-csi-migration
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test pull-cluster-api-provider-azure-e2e-csi-migration
@sonasingh46 looks like the csi migration test is timing out, PTAL