
Add cnv_abnormal recording rule

Open avlitman opened this issue 11 months ago • 26 comments

What this PR does / why we need it: This PR adds recording rules that surface problems with the pods. For example, for each container, a rule reports the memory of the pod with the highest memory exception value, based on working set or RSS memory (in bytes).
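
To illustrate the aggregation described above, here is a minimal Go sketch. It is not taken from the PR: the `podSample` type and `maxDeltaByContainer` function are hypothetical, and it assumes the "memory exception value" is the pod's memory usage (working set or RSS) minus its memory request. The actual rules are Prometheus recording rules and may compute this differently.

```go
package main

import "fmt"

// podSample is a hypothetical, simplified view of one pod's memory figures.
type podSample struct {
	Container    string // container name, e.g. "virt-handler"
	UsageBytes   int64  // working set or RSS memory, in bytes
	RequestBytes int64  // memory request, in bytes
}

// maxDeltaByContainer returns, for each container name, the largest
// (usage - request) value seen across all samples. This mirrors the
// assumed meaning of "highest memory exception value by container".
func maxDeltaByContainer(samples []podSample) map[string]int64 {
	out := make(map[string]int64)
	for _, s := range samples {
		delta := s.UsageBytes - s.RequestBytes
		if cur, ok := out[s.Container]; !ok || delta > cur {
			out[s.Container] = delta
		}
	}
	return out
}

func main() {
	// Illustrative numbers only; they do not come from a real cluster.
	samples := []podSample{
		{Container: "virt-handler", UsageBytes: 300, RequestBytes: 250},
		{Container: "virt-handler", UsageBytes: 400, RequestBytes: 250},
		{Container: "virt-api", UsageBytes: 100, RequestBytes: 150},
	}
	fmt.Println(maxDeltaByContainer(samples))
}
```

A negative value in this sketch means the pod stays below its request, so only the largest positive values would point to a potential problem.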

This PR is based on the rules created in the kubevirt project (https://github.com/kubevirt/kubevirt/pull/11557), and therefore can't be merged before that PR.

Reviewer Checklist

Reviewers are supposed to review the PR for every aspect below one by one. To check an item means the PR is either "OK" or "Not Applicable" in terms of that item. All items are supposed to be checked before merging a PR.

  • [ ] PR Message
  • [ ] Commit Messages
  • [ ] How to test
  • [ ] Unit Tests
  • [ ] Functional Tests
  • [ ] User Documentation
  • [ ] Developer Documentation
  • [ ] Upgrade Scenario
  • [ ] Uninstallation Scenario
  • [ ] Backward Compatibility
  • [ ] Troubleshooting Friendly

Jira Ticket:

Jira-Ticket https://issues.redhat.com/browse/CNV-39598

Release note:

Added cnv_abnormal metric to monitor potential problems

avlitman avatar Mar 23 '24 19:03 avlitman

Pull Request Test Coverage Report for Build 8832183342

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 85.837%

Totals Coverage Status
Change from base Build 8830712335: 0.0%
Covered Lines: 5200
Relevant Lines: 6058

💛 - Coveralls

coveralls avatar Mar 23 '24 19:03 coveralls

hco-e2e-operator-sdk-gcp lane succeeded. /override ci/prow/hco-e2e-operator-sdk-aws

hco-bot avatar Mar 23 '24 20:03 hco-bot

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-aws

In response to this:

hco-e2e-operator-sdk-gcp lane succeeded. /override ci/prow/hco-e2e-operator-sdk-aws

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Mar 23 '24 20:03 kubevirt-bot

hco-e2e-upgrade-operator-sdk-sno-aws lane succeeded. /override ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure

hco-bot avatar Mar 23 '24 21:03 hco-bot

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure

In response to this:

hco-e2e-upgrade-operator-sdk-sno-aws lane succeeded. /override ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure


kubevirt-bot avatar Mar 23 '24 21:03 kubevirt-bot

/hold

avlitman avatar Mar 24 '24 08:03 avlitman

/retest

avlitman avatar Apr 17 '24 08:04 avlitman

@sradco Ready for review

avlitman avatar Apr 17 '24 08:04 avlitman

/lgtm

sradco avatar Apr 21 '24 11:04 sradco

/retest

avlitman avatar Apr 21 '24 14:04 avlitman

/unhold

avlitman avatar Apr 22 '24 07:04 avlitman

hco-e2e-operator-sdk-gcp lane succeeded. /override ci/prow/hco-e2e-operator-sdk-aws

hco-bot avatar Apr 24 '24 22:04 hco-bot

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-aws

In response to this:

hco-e2e-operator-sdk-gcp lane succeeded. /override ci/prow/hco-e2e-operator-sdk-aws


kubevirt-bot avatar Apr 24 '24 22:04 kubevirt-bot

hco-e2e-upgrade-prev-operator-sdk-azure lane succeeded. /override ci/prow/hco-e2e-upgrade-prev-operator-sdk-aws

hco-bot avatar Apr 24 '24 23:04 hco-bot

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-prev-operator-sdk-aws

In response to this:

hco-e2e-upgrade-prev-operator-sdk-azure lane succeeded. /override ci/prow/hco-e2e-upgrade-prev-operator-sdk-aws


kubevirt-bot avatar Apr 24 '24 23:04 kubevirt-bot

/hold
@avlitman no need for a description. The bug is in the base metric: the reason label values should be memory_working_set_delta_from_request and memory_rss_delta_from_request.
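
As a rough sketch of how the two reason label values could be used when selecting the base metric, the Go snippet below builds a PromQL expression string. The base metric name `kubevirt_memory_delta_from_requested_bytes`, the `max by (container, reason)` aggregation, and the `abnormalExpr` helper are assumptions made for illustration; none of them is stated in this thread.

```go
package main

import "fmt"

// Reason label values named in the review comment above.
const (
	reasonWorkingSet = "memory_working_set_delta_from_request"
	reasonRSS        = "memory_rss_delta_from_request"
)

// baseMetric is an ASSUMED name for the KubeVirt base metric;
// the thread does not state the metric name.
const baseMetric = "kubevirt_memory_delta_from_requested_bytes"

// abnormalExpr builds a hypothetical PromQL expression that selects the
// base metric for one reason and keeps the maximum per container.
func abnormalExpr(reason string) string {
	return fmt.Sprintf(`max by (container, reason) (%s{reason=%q})`, baseMetric, reason)
}

func main() {
	for _, r := range []string{reasonWorkingSet, reasonRSS} {
		fmt.Println(abnormalExpr(r))
	}
}
```

If the base metric emits different reason values than the selector expects, an expression like this returns no series, which is consistent with the bug being in the base metric and not in this PR's description.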

sradco avatar Apr 25 '24 12:04 sradco

https://github.com/kubevirt/kubevirt/pull/11794 needs to be merged before this PR.

avlitman avatar Apr 25 '24 12:04 avlitman

/lgtm

sradco avatar Apr 25 '24 15:04 sradco

/unhold

https://github.com/kubevirt/kubevirt/pull/11794 is merged.

avlitman avatar Apr 30 '24 09:04 avlitman

/retest

avlitman avatar Apr 30 '24 09:04 avlitman

hco-e2e-upgrade-operator-sdk-sno-azure lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-sno-aws
hco-e2e-operator-sdk-azure lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-aws
hco-e2e-kv-smoke-gcp lane succeeded.
/override ci/prow/hco-e2e-kv-smoke-azure
hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.
/override ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure

hco-bot avatar Apr 30 '24 09:04 hco-bot

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure, ci/prow/hco-e2e-kv-smoke-azure, ci/prow/hco-e2e-operator-sdk-aws, ci/prow/hco-e2e-upgrade-operator-sdk-sno-aws

In response to this:

hco-e2e-upgrade-operator-sdk-sno-azure lane succeeded.
/override ci/prow/hco-e2e-upgrade-operator-sdk-sno-aws
hco-e2e-operator-sdk-azure lane succeeded.
/override ci/prow/hco-e2e-operator-sdk-aws
hco-e2e-kv-smoke-gcp lane succeeded.
/override ci/prow/hco-e2e-kv-smoke-azure
hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.
/override ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure


kubevirt-bot avatar Apr 30 '24 09:04 kubevirt-bot

/approve

machadovilaca avatar Apr 30 '24 10:04 machadovilaca

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: machadovilaca

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

kubevirt-bot avatar Apr 30 '24 10:04 kubevirt-bot

@nunnatsa @orenc1 any idea why CI is failing?

avlitman avatar May 01 '24 08:05 avlitman

/retest

avlitman avatar May 02 '24 09:05 avlitman

@avlitman: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-hco-e2e-upgrade-operator-sdk-aws | 100657f501b8997a1f73fde9b503e0179adda7b0 | link | true | /test okd-hco-e2e-upgrade-operator-sdk-aws |
| ci/prow/okd-hco-e2e-operator-sdk-aws | 100657f501b8997a1f73fde9b503e0179adda7b0 | link | true | /test okd-hco-e2e-operator-sdk-aws |
| ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure | 100657f501b8997a1f73fde9b503e0179adda7b0 | link | true | /test hco-e2e-consecutive-operator-sdk-upgrades-azure |
| ci/prow/okd-hco-e2e-upgrade-operator-sdk-gcp | 100657f501b8997a1f73fde9b503e0179adda7b0 | link | true | /test okd-hco-e2e-upgrade-operator-sdk-gcp |
| ci/prow/okd-hco-e2e-operator-sdk-gcp | 100657f501b8997a1f73fde9b503e0179adda7b0 | link | true | /test okd-hco-e2e-operator-sdk-gcp |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci[bot] avatar May 02 '24 11:05 openshift-ci[bot]

okd is broken
/override ci/prow/okd-hco-e2e-operator-sdk-aws
/override ci/prow/okd-hco-e2e-operator-sdk-gcp
/override ci/prow/okd-hco-e2e-upgrade-operator-sdk-aws
/override ci/prow/okd-hco-e2e-upgrade-operator-sdk-gcp

nunnatsa avatar May 02 '24 11:05 nunnatsa

@nunnatsa: Overrode contexts on behalf of nunnatsa: ci/prow/okd-hco-e2e-operator-sdk-aws, ci/prow/okd-hco-e2e-operator-sdk-gcp, ci/prow/okd-hco-e2e-upgrade-operator-sdk-aws, ci/prow/okd-hco-e2e-upgrade-operator-sdk-gcp

In response to this:

okd is broken
/override ci/prow/okd-hco-e2e-operator-sdk-aws
/override ci/prow/okd-hco-e2e-operator-sdk-gcp
/override ci/prow/okd-hco-e2e-upgrade-operator-sdk-aws
/override ci/prow/okd-hco-e2e-upgrade-operator-sdk-gcp


kubevirt-bot avatar May 02 '24 11:05 kubevirt-bot