origin ETCD-639: Add E2E test to check if etcd is able to block the rollout of a revision when the quorum is not safe

This E2E test checks whether etcd blocks the rollout of a new revision when the quorum is not safe.

To simulate insufficient quorum, the test brings down one etcd instance by removing its static pod manifest from a debug session on the node.
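For readers unfamiliar with why a single stopped member makes a rollout unsafe, here is a minimal sketch of the quorum arithmetic. It is an illustration only, not the operator's or this test's actual code: the function names `quorumSize` and `safeToRollOut` are hypothetical, and the only fact assumed is that etcd needs a majority of its voting members to be healthy.

```go
package main

import "fmt"

// quorumSize returns the number of voting members that must be healthy
// for a cluster of the given size to keep a majority.
// Hypothetical helper, for illustration only.
func quorumSize(members int) int {
	return members/2 + 1
}

// safeToRollOut reports whether one more member can be restarted
// (as a revision rollout would do) without dropping below quorum.
// Hypothetical helper, for illustration only.
func safeToRollOut(members, healthy int) bool {
	return healthy-1 >= quorumSize(members)
}

func main() {
	// Three members, all healthy: restarting one still leaves a majority.
	fmt.Println(safeToRollOut(3, 3)) // true
	// Three members, one already down (the situation this test creates):
	// restarting another would leave a single member, so a rollout should be blocked.
	fmt.Println(safeToRollOut(3, 2)) // false
}
```

With a three-member control plane, taking one member down leaves no room for a rolling restart, which is the state the test sets up before checking that the new revision does not roll out.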

Jul 29 '24 15:07 jubittajohn

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Jul 29 '24 17:07 openshift-ci-robot

Job Failure Risk Analysis for sha: 2bc10d4b7c2012c2f75ccdecf761b149eddef977

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade	Medium [sig-arch][Late] operators should not create watch channels very often [apigroup:apiserver.openshift.io] [Suite:openshift/conformance/parallel] This test has passed 97.66% of 128 runs on release 4.17 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:single Upgrade:micro] in the last week.

Jul 30 '24 08:07 openshift-trt-bot

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

Jul 30 '24 13:07 openshift-ci-robot

/test e2e-aws-etcd-recovery

Jul 30 '24 20:07 jubittajohn

/test e2e-aws-etcd-recovery

Jul 31 '24 04:07 jubittajohn

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

Jul 31 '24 05:07 openshift-ci-robot

/test e2e-aws-etcd-recovery

Jul 31 '24 13:07 jubittajohn

Job Failure Risk Analysis for sha: 67810e49f897f88ea383d9d265fce04ea913dd09

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	IncompleteTests Tests for this run (103) are below the historical average (480): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Jul 31 '24 16:07 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 01 '24 03:08 jubittajohn

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

Aug 01 '24 03:08 openshift-ci-robot

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

Aug 01 '24 04:08 openshift-ci-robot

/test e2e-aws-etcd-recovery

Aug 01 '24 05:08 jubittajohn

Job Failure Risk Analysis for sha: 5760eb87db226fb6be6258dad6f12d44280e559a

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-upgrade	High [sig-instrumentation] disruption/metrics-api connection/new should be available throughout the test This test has passed 100.00% of 1010 runs on jobs ['periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-upgrade'] in the last 14 days. --- [sig-instrumentation] disruption/metrics-api connection/reused should be available throughout the test This test has passed 100.00% of 1010 runs on jobs ['periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-upgrade'] in the last 14 days.
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade	High [sig-arch][Late] operators should not create watch channels very often [apigroup:apiserver.openshift.io] [Suite:openshift/conformance/parallel] This test has passed 98.07% of 207 runs on release 4.17 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:single Upgrade:micro] in the last week. --- [sig-arch] events should not repeat pathologically for ns/openshift-kube-apiserver-operator This test has passed 98.55% of 207 runs on release 4.17 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:single Upgrade:micro] in the last week.

Aug 01 '24 08:08 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 01 '24 19:08 jubittajohn

/test e2e-aws-etcd-recovery

Aug 02 '24 14:08 jubittajohn

Job Failure Risk Analysis for sha: 17bf685572b7757e34d93833478e8c02accefafa

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	High [sig-node] static pods should start after being created This test has passed 99.51% of 5555 runs on release 4.17 [Overall] in the last week. Open Bugs etcd recovery test has static pod startup failure Static pod controller pods sometimes fail to start

Aug 02 '24 17:08 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 04 '24 22:08 jubittajohn

Job Failure Risk Analysis for sha: a64240a1993c5a782021d66b314bb26951a0048c

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	High [sig-node] static pods should start after being created This test has passed 99.38% of 5761 runs on release 4.17 [Overall] in the last week. Open Bugs etcd recovery test has static pod startup failure Static pod controller pods sometimes fail to start --- [sig-arch] events should not repeat pathologically This test has passed 99.14% of 116 runs on release 4.17 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week.
pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade	IncompleteTests Tests for this run (20) are below the historical average (914): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade	IncompleteTests Tests for this run (20) are below the historical average (946): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-builds	IncompleteTests Tests for this run (19) are below the historical average (982): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn	IncompleteTests Tests for this run (19) are below the historical average (2221): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-csi	IncompleteTests Tests for this run (19) are below the historical average (947): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Aug 05 '24 02:08 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 05 '24 04:08 jubittajohn

Job Failure Risk Analysis for sha: 8ef655ed8c28e31c9857767eae14de7011da5adc

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	High [bz-Monitoring] clusteroperator/monitoring should not change condition/Available This test has passed 98.13% of 5516 runs on release 4.17 [Overall] in the last week. Open Bugs monitoring ClusterOperator should not blip Available=Unknown on client rate limiter --- [sig-node] static pods should start after being created This test has passed 99.37% of 5521 runs on release 4.17 [Overall] in the last week. Open Bugs etcd recovery test has static pod startup failure Static pod controller pods sometimes fail to start --- [sig-arch] events should not repeat pathologically This test has passed 99.12% of 113 runs on release 4.17 [Architecture:amd64 FeatureSet:default Installer:ipi Network:ovn NetworkStack:ipv4 Platform:aws SecurityMode:default Topology:ha Upgrade:none] in the last week.
pull-ci-openshift-origin-master-e2e-openstack-ovn	IncompleteTests Tests for this run (17) are below the historical average (2031): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Aug 05 '24 08:08 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 06 '24 02:08 jubittajohn

Job Failure Risk Analysis for sha: 1545c66cd0f4526e187d40ed02a725792dfbfaba

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	High [sig-node] static pods should start after being created This test has passed 99.33% of 5793 runs on release 4.17 [Overall] in the last week. Open Bugs etcd recovery test has static pod startup failure Static pod controller pods sometimes fail to start

Aug 06 '24 06:08 openshift-trt-bot

/test e2e-aws-etcd-recovery

Aug 21 '24 20:08 jubittajohn

Job Failure Risk Analysis for sha: 47e90aeb07f9226753321fd418562292330e44e8

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-aws-etcd-recovery	High [sig-node] static pods should start after being created This test has passed 99.49% of 4866 runs on release 4.18 [Overall] in the last week. Open Bugs Static pod controller pods sometimes fail to start
pull-ci-openshift-origin-master-e2e-aws-ovn-fips	Medium [sig-node][apigroup:config.openshift.io] CPU Partitioning node validation should have correct cpuset and cpushare set in crio containers [Suite:openshift/conformance/parallel] This test has passed 93.55% of 31 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-fips'] in the last 14 days. Open Bugs CPU partitioning node test perma-failing
pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2	Medium [sig-node][apigroup:config.openshift.io] CPU Partitioning node validation should have correct cpuset and cpushare set in crio containers [Suite:openshift/conformance/parallel] This test has passed 91.43% of 35 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-cgroupsv2'] in the last 14 days. Open Bugs CPU partitioning node test perma-failing
pull-ci-openshift-origin-master-e2e-aws-ovn-kube-apiserver-rollout	Low [Conformance][Suite:openshift/kube-apiserver/rollout][Jira:"kube-apiserver"][sig-kube-apiserver] kube-apiserver should roll out new revisions without disruption [apigroup:config.openshift.io][apigroup:operator.openshift.io] This test has passed 50.00% of 22 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-kube-apiserver-rollout' 'periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-kube-apiserver-rollout'] in the last 14 days.

Aug 22 '24 00:08 openshift-trt-bot

/lgtm

Aug 22 '24 15:08 tjungblu

@jubittajohn: This pull request references ETCD-639 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

Aug 22 '24 16:08 openshift-ci-robot