configuration-anomaly-detection
OSD-30030: E2E Test cases ClusterMonitoringErrorBudgetBurnSRE
E2E Test: ClusterMonitoringErrorBudgetBurn Alert Trigger and Recovery (OSD-30030)
Description: This PR adds an E2E test for the ClusterMonitoringErrorBudgetBurn alert, targeting AWS CCS clusters.
The test misconfigures the user-workload-monitoring-config ConfigMap in the openshift-user-workload-monitoring namespace to simulate excessive monitoring error budget burn. It then checks if a service log is created and finally deletes the ConfigMap to clean up the test state.
Steps:
1. Fetch initial cluster info and current service logs.
2. Back up the original ConfigMap.
3. Inject malformed YAML to trigger the alert.
4. Wait for CAD/system reaction.
5. Validate that a new service log is generated.
6. Delete the ConfigMap as a recovery step.
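The steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the code added by this PR: the `clusterClient` interface, `runErrorBudgetBurnScenario`, the `fakeCluster` stand-in, and the malformed payload are all hypothetical names chosen for the example, and a real test would back the interface with the Kubernetes and OCM clients. The `config.yaml` key is the one the user-workload-monitoring-config ConfigMap normally uses.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterClient is a hypothetical abstraction over the calls the test needs.
type clusterClient interface {
	ServiceLogCount() (int, error)
	GetConfigMap(namespace, name string) (map[string]string, error) // nil map if absent
	ApplyConfigMap(namespace, name string, data map[string]string) error
	DeleteConfigMap(namespace, name string) error
}

const (
	uwmNamespace = "openshift-user-workload-monitoring"
	uwmConfigMap = "user-workload-monitoring-config"
	// malformedConfig is deliberately invalid YAML (unterminated flow sequence).
	// The exact payload is an assumption made for this sketch.
	malformedConfig = "prometheus:\n  retention: [24h\n"
)

// runErrorBudgetBurnScenario follows the six steps and returns the backed-up
// ConfigMap data. wait stands in for "wait for CAD/system reaction".
func runErrorBudgetBurnScenario(c clusterClient, wait func()) (backup map[string]string, err error) {
	// Step 1: record how many service logs exist before the test.
	baseline, err := c.ServiceLogCount()
	if err != nil {
		return nil, fmt.Errorf("fetching baseline service logs: %w", err)
	}
	// Step 2: back up the original ConfigMap.
	backup, err = c.GetConfigMap(uwmNamespace, uwmConfigMap)
	if err != nil {
		return nil, fmt.Errorf("backing up ConfigMap: %w", err)
	}
	// Step 3: inject malformed YAML.
	err = c.ApplyConfigMap(uwmNamespace, uwmConfigMap, map[string]string{"config.yaml": malformedConfig})
	if err != nil {
		return backup, fmt.Errorf("injecting malformed config: %w", err)
	}
	// Step 6: delete the ConfigMap on every exit path after injection.
	defer func() {
		if delErr := c.DeleteConfigMap(uwmNamespace, uwmConfigMap); delErr != nil && err == nil {
			err = fmt.Errorf("deleting ConfigMap: %w", delErr)
		}
	}()
	// Step 4: give the alert and CAD time to react.
	wait()
	// Step 5: a new service log must have appeared.
	after, err := c.ServiceLogCount()
	if err != nil {
		return backup, fmt.Errorf("fetching service logs: %w", err)
	}
	if after <= baseline {
		return backup, errors.New("no new service log was generated")
	}
	return backup, nil
}

// fakeCluster is an in-memory stand-in used only to exercise the sketch.
type fakeCluster struct {
	logs       int
	configMaps map[string]map[string]string
}

func (f *fakeCluster) ServiceLogCount() (int, error) { return f.logs, nil }
func (f *fakeCluster) GetConfigMap(ns, name string) (map[string]string, error) {
	return f.configMaps[ns+"/"+name], nil
}
func (f *fakeCluster) ApplyConfigMap(ns, name string, data map[string]string) error {
	f.configMaps[ns+"/"+name] = data
	return nil
}
func (f *fakeCluster) DeleteConfigMap(ns, name string) error {
	delete(f.configMaps, ns+"/"+name)
	return nil
}
```

Deferring the delete right after the injection keeps the recovery step from being skipped when the wait or the validation fails, which matters because a leftover malformed ConfigMap would keep the alert firing.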
Acceptance:
- Alert is triggered (ClusterMonitoringErrorBudgetBurnSRE).
- Service log is sent to the customer.
- Cluster state is restored by deleting the ConfigMap.
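One way to check the "service log is sent" criterion is to poll the service log count until it exceeds the value recorded before the misconfiguration. The helper below is a sketch under that assumption; `waitForNewServiceLog` and its parameters are hypothetical and not taken from this PR, and the `count` callback stands in for whatever call lists the cluster's service logs.

```go
package main

import (
	"fmt"
	"time"
)

// waitForNewServiceLog polls count up to attempts times, sleeping interval
// between polls, and returns nil as soon as the count exceeds baseline.
func waitForNewServiceLog(count func() (int, error), baseline, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		n, err := count()
		if err != nil {
			return fmt.Errorf("listing service logs: %w", err)
		}
		if n > baseline {
			return nil // a new service log has been generated
		}
		// Do not sleep after the final poll.
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("no new service log after %d polls", attempts)
}
```

Bounding the number of polls keeps a test run from hanging indefinitely when the alert never fires; suitable values for the attempt count and interval depend on how long the alert takes to trigger and are not stated in this PR.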
Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all
@lambasanchit: This pull request references OSD-30030 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 31.92%. Comparing base (a59dd9a) to head (1eb302c). Report is 1 commit behind head on main.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #456   +/-   ##
=======================================
  Coverage   31.92%   31.92%
=======================================
  Files          36       36
  Lines        2487     2487
=======================================
  Hits          794      794
  Misses       1632     1632
  Partials       61       61
/retest
/lgtm
/label tide/merge-method-squash
@lambasanchit: all tests passed!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bergmannf, lambasanchit
- ~~OWNERS~~ [bergmannf]