origin
OCPBUGS-14057: Removes HAProxyDown critical alert exception.
Removes HAProxyDown critical alert exception. Ticket: https://issues.redhat.com/browse/OCPBUGS-14057
/jira-refresh
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: miheer
Once this PR has been reviewed and has the lgtm label, please assign slashpai for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
@miheer: This pull request references Jira Issue OCPBUGS-14057, which is invalid:
- expected the bug to target the "4.16.0" version, but no target version was set
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
The bug has been updated to refer to the pull request using the external bug tracker.
In response to this:
Removes HAProxyDown critical alert exception. Ticket: https://issues.redhat.com/browse/OCPBUGS-14057
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@miheer: This pull request references Jira Issue OCPBUGS-14057, which is invalid:
- expected the bug to target the "4.16.0" version, but no target version was set
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
@miheer: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-gcp-ovn | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | true | /test e2e-gcp-ovn |
ci/prow/e2e-aws-ovn-fips | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | true | /test e2e-aws-ovn-fips |
ci/prow/e2e-openstack-ovn | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-openstack-ovn |
ci/prow/e2e-aws-ovn-single-node | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-aws-ovn-single-node |
ci/prow/e2e-aws-ovn-serial | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | true | /test e2e-aws-ovn-serial |
ci/prow/e2e-aws-ovn-single-node-serial | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-aws-ovn-single-node-serial |
ci/prow/e2e-metal-ipi-sdn | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-metal-ipi-sdn |
ci/prow/e2e-aws-ovn-single-node-upgrade | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-aws-ovn-single-node-upgrade |
ci/prow/e2e-aws-ovn-cgroupsv2 | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-aws-ovn-cgroupsv2 |
ci/prow/e2e-agnostic-ovn-cmd | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-agnostic-ovn-cmd |
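For triage, the rerun commands for the required jobs can be pulled out of the table above mechanically. This is a minimal sketch, assuming the rows have been saved as pipe-delimited text in the same column order; the function name `required_reruns` is hypothetical and not part of Prow.

```shell
#!/bin/sh
# Print the rerun command for every failed job whose "Required" column is true.
# Assumes pipe-delimited rows: name | commit | details | required | rerun command |
required_reruns() {
  awk -F'|' '
    {
      req = $4; cmd = $5
      gsub(/^[ \t]+|[ \t]+$/, "", req)   # trim whitespace around the field
      gsub(/^[ \t]+|[ \t]+$/, "", cmd)
      if (req == "true") print cmd       # header and separator rows never match
    }'
}

# Example input: two rows copied from the table above.
sample='ci/prow/e2e-gcp-ovn | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | true | /test e2e-gcp-ovn |
ci/prow/e2e-openstack-ovn | 67f5bd80550d217a11c99023ca08f5e27933e0d3 | link | false | /test e2e-openstack-ovn |'

printf '%s\n' "$sample" | required_reruns
```

With the two sample rows this prints only `/test e2e-gcp-ovn`, since the openstack job is not required.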
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/hold until https://github.com/openshift/runbooks/pull/166/ is merged.
/jira refresh
@Miciah: This pull request references Jira Issue OCPBUGS-14057, which is valid. The bug has been moved to the POST state.
3 validation(s) were run on this bug
- bug is open, matching expected state (open)
- bug target version (4.16.0) matches configured target version for branch (4.16.0)
- bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
Requesting review from QA contact: /cc @ShudiLi
/assign @Miciah /assign
Tested it with 4.16.0-0.ci.test-2024-05-15-084545-ci-ln-x0xpjtt-latest. When HAProxy was down, the HAProxyDown alert was shown in the web console.

1. Check the cluster version:
   % oc get clusterversion
   NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
   version   4.16.0-0.ci.test-2024-05-15-084545-ci-ln-x0xpjtt-latest   True        False         111m    Cluster version is 4.16.0-0.ci.test-2024-05-15-084545-ci-ln-x0xpjtt-latest

2. Scale down the cluster-version operator and the ingress operator:
   % oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
   % oc scale --replicas 0 -n openshift-ingress-operator deployments ingress-operator

3. Edit the router-default deployment to remove the livenessProbe and startupProbe checks.

4. rsh into a router pod and kill the haproxy process:
   % oc -n openshift-ingress get pods
   NAME                              READY   STATUS    RESTARTS   AGE
   router-default-595f85875f-2j5jr   1/1     Running   0          73m
   router-default-595f85875f-8glrr   1/1     Running   0          73m

5. Log in to the web console and go to Observe >> Alerting. The HAProxyDown alert is shown:
   Name: HAProxyDown
   Description: This alert fires when metrics report that HAProxy is down.
   Summary: HAProxy is down
   Runbook: https://github.com/openshift/runbooks/blob/master/alerts/HAProxyDown.md
   Labels: prometheus=openshift-monitoring/k8s severity=critical alertname=HAProxyDown pod=router-default-595f85875f-8glrr
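The verification steps above could be collected into a small script along these lines. This is only a sketch under assumptions: the `run` wrapper, the `DRY_RUN` variable, and the `verify_haproxydown` function are hypothetical names, the script prints the commands by default instead of executing them, and the probe removal and the kill of the haproxy process remain manual because `oc edit` and `oc rsh` are interactive.

```shell
#!/bin/sh
# Sketch of the manual verification steps. Prints each command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"    # only show what would be executed
  else
    "$@"
  fi
}

verify_haproxydown() {
  # $1: name of a router pod, as listed by `oc -n openshift-ingress get pods`
  # once the edited deployment has settled.
  pod="$1"

  # Stop the operators so they do not revert the changes made below.
  run oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
  run oc scale --replicas 0 -n openshift-ingress-operator deployments ingress-operator

  # Opens an editor: remove the livenessProbe and startupProbe entries by hand.
  run oc -n openshift-ingress edit deployment/router-default

  # Opens a shell in the pod: find and kill the haproxy process by hand.
  run oc -n openshift-ingress rsh "$pod"
}

verify_haproxydown router-default-595f85875f-8glrr
```

Keeping the default as a dry run is a deliberate choice here, since scaling down the cluster-version operator is disruptive and should only be done on a disposable test cluster.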
/label qe-approved thanks
@miheer: This pull request references Jira Issue OCPBUGS-14057, which is valid.
3 validation(s) were run on this bug
- bug is open, matching expected state (open)
- bug target version (4.16.0) matches configured target version for branch (4.16.0)
- bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Requesting review from QA contact: /cc @ShudiLi
The bug has been updated to refer to the pull request using the external bug tracker.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale