origin
ETCD-565: add manual etcd signer cert rotation e2e test
This PR adds a suite of tests related to rotation of etcd certificates.
@tjungblu: This pull request references ETCD-565 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/test ?
@tjungblu: The following commands are available to trigger required jobs:
- /test e2e-aws-jenkins
- /test e2e-aws-ovn-fips
- /test e2e-aws-ovn-image-registry
- /test e2e-aws-ovn-serial
- /test e2e-gcp-ovn
- /test e2e-gcp-ovn-builds
- /test e2e-gcp-ovn-image-ecosystem
- /test e2e-gcp-ovn-upgrade
- /test e2e-metal-ipi-ovn-ipv6
- /test images
- /test lint
- /test unit
- /test verify
- /test verify-deps
The following commands are available to trigger optional jobs:
- /test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback
- /test e2e-agnostic-ovn-cmd
- /test e2e-aws
- /test e2e-aws-csi
- /test e2e-aws-disruptive
- /test e2e-aws-etcd-recovery
- /test e2e-aws-multitenant
- /test e2e-aws-ovn
- /test e2e-aws-ovn-cgroupsv2
- /test e2e-aws-ovn-etcd-scaling
- /test e2e-aws-ovn-kubevirt
- /test e2e-aws-ovn-single-node
- /test e2e-aws-ovn-single-node-serial
- /test e2e-aws-ovn-single-node-upgrade
- /test e2e-aws-ovn-upgrade
- /test e2e-aws-ovn-upi
- /test e2e-aws-proxy
- /test e2e-azure
- /test e2e-azure-ovn-etcd-scaling
- /test e2e-baremetalds-kubevirt
- /test e2e-gcp-csi
- /test e2e-gcp-disruptive
- /test e2e-gcp-fips-serial
- /test e2e-gcp-ovn-etcd-scaling
- /test e2e-gcp-ovn-rt-upgrade
- /test e2e-gcp-ovn-techpreview
- /test e2e-gcp-ovn-techpreview-serial
- /test e2e-metal-ipi-ovn-dualstack
- /test e2e-metal-ipi-ovn-dualstack-local-gateway
- /test e2e-metal-ipi-sdn
- /test e2e-metal-ipi-serial
- /test e2e-metal-ipi-serial-ovn-ipv6
- /test e2e-metal-ipi-virtualmedia
- /test e2e-openstack-ovn
- /test e2e-openstack-serial
- /test e2e-vsphere
- /test e2e-vsphere-ovn-dualstack-primaryv6
- /test e2e-vsphere-ovn-etcd-scaling
- /test okd-e2e-gcp
Use /test all to run the following jobs that were automatically triggered:
- pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd
- pull-ci-openshift-origin-master-e2e-aws-csi
- pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2
- pull-ci-openshift-origin-master-e2e-aws-ovn-fips
- pull-ci-openshift-origin-master-e2e-aws-ovn-serial
- pull-ci-openshift-origin-master-e2e-aws-ovn-single-node
- pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial
- pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade
- pull-ci-openshift-origin-master-e2e-aws-ovn-upgrade
- pull-ci-openshift-origin-master-e2e-gcp-csi
- pull-ci-openshift-origin-master-e2e-gcp-ovn
- pull-ci-openshift-origin-master-e2e-gcp-ovn-builds
- pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade
- pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade
- pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6
- pull-ci-openshift-origin-master-e2e-metal-ipi-sdn
- pull-ci-openshift-origin-master-e2e-openstack-ovn
- pull-ci-openshift-origin-master-images
- pull-ci-openshift-origin-master-lint
- pull-ci-openshift-origin-master-unit
- pull-ci-openshift-origin-master-verify
- pull-ci-openshift-origin-master-verify-deps
/test e2e-aws-disruptive
/test e2e-aws-etcd-recovery
Job Failure Risk Analysis for sha: 47eef202b7733ab1abca86bcd34c627243ac5373
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 | IncompleteTests: tests for this run (100) are below the historical average (1099). Not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems. |
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial | Medium: [sig-arch] events should not repeat pathologically for ns/openshift-authentication-operator. This test has passed 90.62% of 64 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. Open bugs: Auth operator capable of firing over 100 events in seconds on OpenShiftAPICheckFailed. |
/retest
Job Failure Risk Analysis for sha: 87b3bada7916590c754160f39fddc4e574b2c840
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial | High: [sig-api-machinery] disruption/cache-kube-api connection/reused should be available throughout the test. This test has passed 100.00% of 70 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/cache-openshift-api connection/reused should be available throughout the test. This test has passed 100.00% of 70 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/cache-oauth-api connection/new should be available throughout the test. This test has passed 100.00% of 70 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/oauth-api connection/new should be available throughout the test. This test has passed 100.00% of 70 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- Showing 4 of 12 test results. |
| pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 | IncompleteTests: tests for this run (98) are below the historical average (745). Not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems. |
/retest
Job Failure Risk Analysis for sha: 5b1e7b863809d50a5f62cdfccaf4cadec5ff1873
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial | High: [sig-api-machinery] disruption/kube-api connection/reused should be available throughout the test. This test has passed 100.00% of 64 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/openshift-api connection/reused should be available throughout the test. This test has passed 100.00% of 64 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/oauth-api connection/new should be available throughout the test. This test has passed 100.00% of 64 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- [sig-api-machinery] disruption/cache-oauth-api connection/reused should be available throughout the test. This test has passed 100.00% of 64 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. --- Showing 4 of 12 test results. |
| pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6 | IncompleteTests: tests for this run (18) are below the historical average (840). Not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems. |
Job Failure Risk Analysis for sha: bc71c2d79959bea1ffac25f35f6b84b61dd4f794
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial | Medium: [sig-arch] events should not repeat pathologically for ns/openshift-etcd. This test has passed 80.85% of 47 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.16-e2e-aws-ovn-single-node-serial'] in the last 14 days. |
/test e2e-aws-etcd-recovery
Job Failure Risk Analysis for sha: eb745771fe0e97ecfc88a7d4fc16f8351132012e
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-aws-ovn-upgrade | High: [sig-apps] job-upgrade. This test has passed 100.00% of 272 runs on jobs ['periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-upgrade'] in the last 14 days. |
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade | High: [sig-arch] events should not repeat pathologically for ns/openshift-kube-apiserver-operator. This test has passed 99.63% of 3780 runs on release 4.17 [Overall] in the last week. --- [sig-arch] events should not repeat pathologically for ns/openshift-etcd-operator. This test has passed 99.66% of 3780 runs on release 4.17 [Overall] in the last week. --- [bz-Node Tuning Operator] clusteroperator/node-tuning should not change condition/Available. This test has passed 99.63% of 3784 runs on release 4.17 [Overall] in the last week. |
/test e2e-aws-etcd-recovery
/test e2e-aws-etcd-recovery
/test e2e-aws-etcd-recovery
/test e2e-aws-etcd-recovery
/cherry-pick release-4.16
@tjungblu: once the present PR merges, I will cherry-pick it on top of release-4.16 in a new PR and assign it to you.
/test e2e-aws-etcd-recovery
Job Failure Risk Analysis for sha: 19dbf04142214a0351460392f4941ae43dbdff30
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade | High: [sig-scheduling][Early] The openshift-console console pods [apigroup:console.openshift.io] should be scheduled on different nodes [Suite:openshift/conformance/parallel]. This test has passed 99.35% of 3848 runs on release 4.17 [Overall] in the last week. --- [sig-network-edge] Verify DNS availability during and after upgrade success. This test has passed 99.56% of 1590 runs on release 4.17 [Overall] in the last week. --- [bz-Node Tuning Operator] clusteroperator/node-tuning should not change condition/Available. This test has passed 99.56% of 3906 runs on release 4.17 [Overall] in the last week. |
| pull-ci-openshift-origin-master-e2e-aws-etcd-recovery | High: [sig-arch] events should not repeat pathologically for ns/openshift-operator-lifecycle-manager. This test has passed 99.66% of 3870 runs on release 4.17 [Overall] in the last week. --- [sig-arch] events should not repeat pathologically. This test has passed 98.80% of 83 runs on release 4.17 [amd64 aws ha ovn] in the last week. Open bugs: lots of churn during image registry managed/removed transition; Excessive TopologyAwareHintsDisabled events due to service/dns-default with topology aware hints activated; Excessive TopologyAwareHintsDisabled events due to service/dns-default with topology aware hints activated; [4.15] "k8s.ovn.org/node-chassis-id annotation not found" event causing CI failures. |
/test e2e-aws-etcd-recovery
While not strictly disruptive, we can put the cert rotation tests in the recovery suite for now.
The tests themselves look good to me. 👍 for covering the dynamic cert recreation.
My only question is about the test time, which seems surprisingly fast.
Is ~2 minutes all it takes for a revision rollout these days?
/approve
Holding in case @soltysh had a follow up to his earlier review.
Hmm, maybe a race condition? 2m also seems too fast to me.
/hold cancel
thank you both!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hasbro17, soltysh, tjungblu
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~test/extended/dr/OWNERS~~ [hasbro17,soltysh,tjungblu]
- ~~test/extended/etcd/OWNERS~~ [hasbro17,soltysh]
- ~~test/extended/util/annotate/generated/OWNERS~~ [hasbro17,soltysh]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/retest-required
Remaining retests: 0 against base HEAD f9a573a84d345a55791ff4630b1b2bc2a7233f15 and 2 for PR HEAD 57736c8b21544eb2426b2cc2afeda215e22cc92e in total
/retest-required
Job Failure Risk Analysis for sha: 57736c8b21544eb2426b2cc2afeda215e22cc92e
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-master-e2e-gcp-ovn | Medium: [sig-storage] Multi-AZ Cluster Volumes should schedule pods in the same zones as statically provisioned PVs [Suite:openshift/conformance/parallel] [Suite:k8s]. This test has passed 91.60% of 714 runs on release 4.17 [Overall] in the last week. --- [sig-storage] PersistentVolumes GCEPD [Feature:StorageProvider] should test that deleting a PVC before the pod does not cause pod deletion to fail on PD detach [Skipped:NoOptionalCapabilities] [Suite:openshift/conformance/parallel] [Suite:k8s]. This test has passed 93.29% of 715 runs on release 4.17 [Overall] in the last week. --- [sig-storage] PersistentVolumes GCEPD [Feature:StorageProvider] should test that deleting the PV before the pod does not cause pod deletion to fail on PD detach [Skipped:NoOptionalCapabilities] [Suite:openshift/conformance/parallel] [Suite:k8s]. This test has passed 93.42% of 714 runs on release 4.17 [Overall] in the last week. Open bugs: 4.17 ci failures: persistentvolumes "gce-" is forbidden ... GCE PD ...disk is not found |