hyperconverged-cluster-operator

pkg/monitoring/metrics: add new alert for vms using outdated machine type

Open dasionov opened this issue 1 year ago • 15 comments

What this PR does / why we need it:

This PR introduces a new Prometheus alert for virtual machines (VMs) in the cluster that require a machine type update. For instance, VMs configured with a RHEL 8 machine type would be flagged as outdated, because the base image of the virt-launcher will transition to RHEL 10, which introduces breaking changes incompatible with older machine types.
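
As a rough illustration of the kind of check such an alert depends on (this is not the code in this PR), here is a minimal Go sketch. It assumes machine types follow the `pc-q35-rhelX.Y.Z` naming pattern; the function name `isOutdatedMachineType` and the `minMajor` threshold are hypothetical and chosen only for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// rhelMachineType matches machine type strings that end in a RHEL version,
// for example "pc-q35-rhel8.6.0". The naming pattern is an assumption.
var rhelMachineType = regexp.MustCompile(`rhel(\d+)\.\d+\.\d+$`)

// isOutdatedMachineType (hypothetical helper) reports whether machineType
// embeds a RHEL major version lower than minMajor. Strings that do not
// match the assumed pattern are treated as not outdated.
func isOutdatedMachineType(machineType string, minMajor int) bool {
	m := rhelMachineType.FindStringSubmatch(machineType)
	if m == nil {
		return false
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return false
	}
	return major < minMajor
}

func main() {
	// Example inputs; the threshold of 9 is illustrative only.
	for _, mt := range []string{"pc-q35-rhel8.6.0", "pc-q35-rhel9.4.0", "q35"} {
		fmt.Printf("%s outdated=%t\n", mt, isOutdatedMachineType(mt, 9))
	}
}
```

In a real setup, a per-VM metric would presumably carry the result of a check like this, and the alert expression would fire when that metric reports at least one outdated VM.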

Depends-On #3132, https://github.com/kubevirt/kubevirt/pull/13010

Reviewer Checklist

  • [x] PR Message
  • [x] Commit Messages
  • [ ] How to test
  • [ ] Unit Tests
  • [x] Functional Tests
  • [ ] User Documentation
  • [ ] Developer Documentation
  • [ ] Upgrade Scenario
  • [ ] Uninstallation Scenario
  • [ ] Backward Compatibility
  • [ ] Troubleshooting Friendly

Release note:

None

dasionov avatar Sep 25 '24 20:09 dasionov

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign sradco for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

kubevirt-bot avatar Sep 25 '24 20:09 kubevirt-bot

Is HCO the right place for this metric? HCO does not know about VMs at all and should not monitor them.

/hold

nunnatsa avatar Sep 25 '24 20:09 nunnatsa

Pull Request Test Coverage Report for Build 11310572661

Details

  • 10 of 40 (25.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.2%) to 72.006%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
pkg/monitoring/rules/alerts/operator_alerts.go | 10 | 40 | 25.0%
Total | 10 | 40 |
Totals Coverage Status
Change from base Build 11252860062: -0.2%
Covered Lines: 5970
Relevant Lines: 8291

💛 - Coveralls

coveralls avatar Sep 27 '24 01:09 coveralls

/test ci/prow/hco-e2e-operator-sdk-azure

dasionov avatar Sep 30 '24 22:09 dasionov

@dasionov: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

  • /test build-hco-test-utils-image
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.31

Use /test all to run the following jobs that were automatically triggered:

  • pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • pull-hyperconverged-cluster-operator-e2e-k8s-1.31

In response to this:

/test ci/prow/hco-e2e-operator-sdk-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Sep 30 '24 22:09 kubevirt-bot

@dasionov: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

  • /test ci-index-hco-bundle
  • /test ci-index-hco-upgrade-operator-sdk-bundle
  • /test hco-e2e-consecutive-operator-sdk-upgrades-aws
  • /test hco-e2e-consecutive-operator-sdk-upgrades-azure
  • /test hco-e2e-kv-smoke-azure
  • /test hco-e2e-kv-smoke-gcp
  • /test hco-e2e-operator-sdk-aws
  • /test hco-e2e-operator-sdk-azure
  • /test hco-e2e-operator-sdk-gcp
  • /test hco-e2e-upgrade-operator-sdk-aws
  • /test hco-e2e-upgrade-operator-sdk-azure
  • /test hco-e2e-upgrade-prev-operator-sdk-aws
  • /test hco-e2e-upgrade-prev-operator-sdk-azure
  • /test images
  • /test okd-ci-index-hco-bundle
  • /test okd-ci-index-hco-upgrade-operator-sdk-bundle
  • /test okd-images

The following commands are available to trigger optional jobs:

  • /test hco-e2e-operator-sdk-sno-aws
  • /test hco-e2e-operator-sdk-sno-azure
  • /test hco-e2e-upgrade-operator-sdk-sno-aws
  • /test hco-e2e-upgrade-operator-sdk-sno-azure
  • /test hco-e2e-upgrade-prev-operator-sdk-sno-aws
  • /test hco-e2e-upgrade-prev-operator-sdk-sno-azure
  • /test okd-hco-e2e-operator-sdk-aws
  • /test okd-hco-e2e-operator-sdk-gcp
  • /test okd-hco-e2e-upgrade-operator-sdk-aws
  • /test okd-hco-e2e-upgrade-operator-sdk-gcp

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-ci-index-hco-bundle
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-ci-index-hco-upgrade-operator-sdk-bundle
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-consecutive-operator-sdk-upgrades-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-consecutive-operator-sdk-upgrades-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-kv-smoke-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-kv-smoke-gcp
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-operator-sdk-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-operator-sdk-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-operator-sdk-gcp
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-operator-sdk-sno-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-operator-sdk-sno-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-operator-sdk-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-operator-sdk-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-operator-sdk-sno-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-operator-sdk-sno-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-prev-operator-sdk-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-prev-operator-sdk-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-prev-operator-sdk-sno-aws
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-hco-e2e-upgrade-prev-operator-sdk-sno-azure
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-images
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-okd-ci-index-hco-bundle
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-okd-ci-index-hco-upgrade-operator-sdk-bundle
  • pull-ci-kubevirt-hyperconverged-cluster-operator-main-okd-images

In response to this:

/test ci/prow/hco-e2e-operator-sdk-azure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci[bot] avatar Sep 30 '24 22:09 openshift-ci[bot]

/test hco-e2e-operator-sdk-aws

dasionov avatar Sep 30 '24 22:09 dasionov

@dasionov: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

  • /test build-hco-test-utils-image
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • /test pull-hyperconverged-cluster-operator-e2e-k8s-1.31

Use /test all to run the following jobs that were automatically triggered:

  • pull-hyperconverged-cluster-operator-e2e-k8s-1.30
  • pull-hyperconverged-cluster-operator-e2e-k8s-1.31

In response to this:

/test hco-e2e-operator-sdk-aws

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Sep 30 '24 22:09 kubevirt-bot

/cc @machadovilaca

dasionov avatar Oct 01 '24 11:10 dasionov

/cc @enp0s3

dasionov avatar Oct 09 '24 13:10 dasionov

/retest-required

dasionov avatar Oct 10 '24 19:10 dasionov

/retest

dasionov avatar Oct 10 '24 21:10 dasionov

@dasionov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-aws | 037c42287b7c8bf6f232f9cb9a37b32aaee26b27 | link | true | /test hco-e2e-consecutive-operator-sdk-upgrades-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci[bot] avatar Oct 13 '24 03:10 openshift-ci[bot]

> Is HCO the right place for this metric? HCO does not know about VMs at all and should not monitor them.
>
> /hold

I also have some concerns about whether it makes sense to check for RHEL versions here.

machadovilaca avatar Oct 15 '24 11:10 machadovilaca

/close

dasionov avatar Nov 12 '24 14:11 dasionov

@dasionov: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubevirt-bot avatar Nov 12 '24 14:11 kubevirt-bot