OCPBUGS-55238: spyglass: hide disruption events for localhost
Don't display localhost-related disruptions on spyglass. These are still displayed on non-spyglass reports in case unexpected localhost disruption happens.
@vrutkovs: This pull request references Jira Issue OCPBUGS-55238, which is invalid:
- expected the bug to target the "4.19.0" version, but no target version was set
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@vrutkovs: This pull request references Jira Issue OCPBUGS-55238, which is valid. The bug has been moved to the POST state.
3 validation(s) were run on this bug
- bug is open, matching expected state (open)
- bug target version (4.19.0) matches configured target version for branch (4.19.0)
- bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
Requesting review from QA contact: /cc @wangke19
Risk analysis has seen new tests most likely introduced by this PR. Please ensure that new tests meet guidelines for naming and stability.
New Test Risks for sha: c900caa870cc47f362d593748254b99121307085
| Job Name | New Test Risk |
|---|---|
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial | Medium - "Find all of the input images from ocp/4.20 and tag them into the stable stream" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial | Medium - "Find all of the input images from ocp/4.20 and tag them into the stable-initial stream" is a new test, and was only seen in one job. |
New tests seen in this PR at sha: c900caa870cc47f362d593748254b99121307085
- "Find all of the input images from ocp/4.20 and tag them into the stable stream" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "Find all of the input images from ocp/4.20 and tag them into the stable-initial stream" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
The problem with leaving expected disruption in and hiding it only in the UI is the larger system used to monitor disruption data. Every part of that system needs the same accommodation; otherwise it treats localhost disruption as real disruption and starts monitoring it for changes. That includes the Grafana dashboard, the alerts in the dpcr cluster, the metrics published by sippy for those alerts, and the scheduled queries in BigQuery used for reporting.
Do you intend to have this monitored for changes in disruption and pursue fixes for those issues?
If so, then maybe we leave it in (but we wouldn't want to hide it on interval charts).
If not, these intervals really should be classified with a different source. That would immediately remove them from the analysis framework, and they would not appear in this chart.
Also remember the new intervals UI under debug tools is at https://github.com/openshift/sippy/blob/main/sippy-ng/src/prow_job_runs/IntervalsChart.js and it is largely based on categorizing by Source.
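To illustrate why a separate source takes these intervals out of anything keyed on Source, here is a minimal sketch in Go. The `Interval` type and `filterBySource` helper are hypothetical simplifications, not the actual origin or sippy types; the source names `Disruption` and `DisruptionLocalhost` are the ones that appear in the sippy intervals link later in this thread.

```go
package main

import "fmt"

// Interval is a hypothetical, simplified stand-in for a monitor interval.
// The real type carries far more detail (locator, timestamps, level).
type Interval struct {
	Source  string
	Message string
}

// filterBySource keeps only the intervals whose Source is in selected.
// A consumer that selects only "Disruption" never sees intervals that
// were emitted under a different source.
func filterBySource(intervals []Interval, selected ...string) []Interval {
	want := make(map[string]bool, len(selected))
	for _, s := range selected {
		want[s] = true
	}
	var out []Interval
	for _, iv := range intervals {
		if want[iv.Source] {
			out = append(out, iv)
		}
	}
	return out
}

func main() {
	intervals := []Interval{
		{Source: "Disruption", Message: "external endpoint stopped responding"},
		{Source: "DisruptionLocalhost", Message: "localhost endpoint stopped responding"},
		{Source: "Alert", Message: "some alert fired"},
	}
	// Only the first interval is printed; the localhost one is excluded
	// because it carries a different source.
	for _, iv := range filterBySource(intervals, "Disruption") {
		fmt.Println(iv.Source, iv.Message)
	}
}
```

A chart that still wants to show the localhost intervals would simply add `DisruptionLocalhost` to the selected sources.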
> Do you intend to have this monitored for changes in disruption and pursue fixes for those issues?
Localhost disruptions are expected when a pod restarts (on rollout), so displaying them may be misleading; in most cases they are expected to happen.
> If so, then maybe we leave it in (but we wouldn't want to hide it on interval charts).
We're hiding them on the main chart, but leaving them on non-spyglass charts for completeness.
> If not, these intervals really should be classified with a different source. That would immediately remove them from the analysis framework, and they would not appear in this chart.
I don't think these are being sent for analysis anyway.
They have been spamming #trt-alerts for weeks now, up to and including today, so they are definitely going into the analysis system.
Can you skip generating the intervals when it's expected?
I think it's easier to move them to a different source.
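Moving the intervals to a different source could look roughly like the sketch below, where the source is chosen when the interval is generated. This is an assumption-laden illustration, not the code from this PR: the constant names, the `sourceForBackend` helper, the rule of matching on a `localhost` substring, and the backend names are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Source names as they appear in the sippy intervals UI; the Go constant
// names themselves are hypothetical.
const (
	sourceDisruption          = "Disruption"
	sourceDisruptionLocalhost = "DisruptionLocalhost"
)

// sourceForBackend picks the source for a disruption interval.
// Assumption: localhost backends can be recognized by their name.
func sourceForBackend(backendName string) string {
	if strings.Contains(backendName, "localhost") {
		return sourceDisruptionLocalhost
	}
	return sourceDisruption
}

func main() {
	// Hypothetical backend names, used only for illustration.
	for _, name := range []string{
		"kube-api-new-connections",
		"localhost-kube-api-new-connections",
	} {
		fmt.Printf("%s -> %s\n", name, sourceForBackend(name))
	}
}
```

With this approach the intervals are still recorded, so unexpected localhost disruption remains visible, but anything that consumes only the `Disruption` source no longer picks them up.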
This looks great, thank you, just waiting to see the resulting files.
https://sippy.dptools.openshift.org/sippy-ng/job_runs/1947614946313375744/pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade/openshift_origin/29710/intervals?end=2025-07-22T13%3A43%3A07Z&filterText=&intervalFile=e2e-timelines_spyglass_20250722-123441.json&overrideDisplayFlag=0&selectedSources=OperatorAvailable&selectedSources=OperatorProgressing&selectedSources=OperatorDegraded&selectedSources=KubeletLog&selectedSources=EtcdLog&selectedSources=EtcdLeadership&selectedSources=Alert&selectedSources=Disruption&selectedSources=E2EFailed&selectedSources=APIServerGracefulShutdown&selectedSources=KubeEvent&selectedSources=NodeState&selectedSources=DisruptionLocalhost&start=2025-07-22T11%3A56%3A40Z
Looks good to me.
/lgtm
/hold
Release when you're happy with the results.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dgoodwin, vrutkovs
- ~~OWNERS~~ [dgoodwin]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/hold cancel
Yup, looks good
/retest-required
Remaining retests: 0 against base HEAD af0e85d21e1c8f02c7c0272b4a7e6f0d6f9db314 and 2 for PR HEAD da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 in total
/retest-required
Remaining retests: 0 against base HEAD 47eed7a6649663d0685122c61e44f7a0a63049b0 and 1 for PR HEAD da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 in total
/retest-required
Remaining retests: 0 against base HEAD b392d63f16d05e3a6d8e4673a67d362ccc0f6de3 and 2 for PR HEAD da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 in total
@vrutkovs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback | c900caa870cc47f362d593748254b99121307085 | link | false | /test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback |
| ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw-techpreview | c900caa870cc47f362d593748254b99121307085 | link | false | /test e2e-metal-ipi-ovn-dualstack-bgp-local-gw-techpreview |
| ci/prow/okd-e2e-gcp | c900caa870cc47f362d593748254b99121307085 | link | false | /test okd-e2e-gcp |
| ci/prow/e2e-gcp-fips-serial | c900caa870cc47f362d593748254b99121307085 | link | false | /test e2e-gcp-fips-serial |
| ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-techpreview | c900caa870cc47f362d593748254b99121307085 | link | false | /test e2e-metal-ipi-ovn-dualstack-bgp-techpreview |
| ci/prow/e2e-metal-ipi-serial | c900caa870cc47f362d593748254b99121307085 | link | false | /test e2e-metal-ipi-serial |
| ci/prow/e2e-metal-ipi-serial-ovn-ipv6 | c900caa870cc47f362d593748254b99121307085 | link | false | /test e2e-metal-ipi-serial-ovn-ipv6 |
| ci/prow/e2e-aws-ovn-serial | c900caa870cc47f362d593748254b99121307085 | link | true | /test e2e-aws-ovn-serial |
| ci/prow/e2e-aws-ovn-serial-publicnet | c900caa870cc47f362d593748254b99121307085 | link | true | /test e2e-aws-ovn-serial-publicnet |
| ci/prow/e2e-aws-ovn-kube-apiserver-rollout | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-ovn-kube-apiserver-rollout |
| ci/prow/e2e-gcp-ovn-rt-upgrade | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-aws-ovn-etcd-scaling | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-ovn-etcd-scaling |
| ci/prow/okd-scos-e2e-aws-ovn | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-gcp-disruptive | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-disruptive |
| ci/prow/e2e-gcp-fips-serial-2of2 | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-fips-serial-2of2 |
| ci/prow/e2e-azure-ovn-etcd-scaling | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-azure-ovn-etcd-scaling |
| ci/prow/e2e-openstack-serial | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-openstack-serial |
| ci/prow/e2e-azure-ovn-upgrade | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-gcp-ovn-techpreview | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-ovn-techpreview |
| ci/prow/e2e-openstack-ovn | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-aws-disruptive | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-aws-ovn-microshift-serial | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-ovn-microshift-serial |
| ci/prow/e2e-gcp-ovn-etcd-scaling | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-ovn-etcd-scaling |
| ci/prow/e2e-aws-ovn-microshift | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-ovn-microshift |
| ci/prow/e2e-gcp-fips-serial-1of2 | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-fips-serial-1of2 |
| ci/prow/e2e-gcp-ovn-techpreview-serial-2of2 | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-gcp-ovn-techpreview-serial-2of2 |
| ci/prow/e2e-aws-ovn-single-node-upgrade | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-aws-ovn-single-node-upgrade |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-vsphere-ovn-etcd-scaling | da1a05cc62aef50f4ac6e3cd1c6a632c589628e5 | link | false | /test e2e-vsphere-ovn-etcd-scaling |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Job Failure Risk Analysis for sha: da1a05cc62aef50f4ac6e3cd1c6a632c589628e5
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-main-e2e-aws-disruptive | IncompleteTests Tests for this run (106) are below the historical average (341): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems) |
| pull-ci-openshift-origin-main-e2e-gcp-ovn-etcd-scaling | Low [bz-etcd][invariant] alert/etcdMembersDown should not be at or above info This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:gcp SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time --- [bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded This test has passed 0.00% of 1 runs on release 4.20 [Architecture:amd64 FeatureSet:default Installer:ipi JobTier:rare Network:ovn NetworkStack:ipv4 Owner:eng Platform:gcp SecurityMode:default Topology:ha Upgrade:none] in the last week. Open Bugs etcd-scaling jobs failing ~60% of the time |
@vrutkovs: Jira Issue OCPBUGS-55238: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-55238 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER]
Distgit: openshift-enterprise-tests
This PR has been included in build openshift-enterprise-tests-container-v4.20.0-202507231546.p0.g848143e.assembly.stream.el9. All builds following this will include this PR.
/cherry-pick release-4.19
@wangke19: new pull request created: #30023