OCPBUGS-65674: VsphereConfigurationTestsRollOutTooOften event matcher should also match Deployment and DaemonSet events

Open RomanBednar opened this issue 4 weeks ago • 13 comments

We've observed more pathological events related to the vSphere snapshot tests, this time caused by Deployment and DaemonSet updates. Both objects use secret hash annotation hooks, so updates are expected whenever snapshot options change, and the event matcher should filter out the related events.

event happened 29 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/f08cbd1e38 - reason/DaemonSetUpdated Updated DaemonSet.apps/vmware-vsphere-csi-driver-node -n openshift-cluster-csi-drivers because it changed (06:48:16Z) result=reject 
event happened 21 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/5fc8006e70 - reason/DeploymentUpdated Updated Deployment.apps/vmware-vsphere-csi-driver-controller -n openshift-cluster-csi-drivers because it changed (06:47:32Z) result=reject } 
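
To illustrate the filtering idea, here is a minimal sketch in Python. It is not the matcher code from this PR: the function and constant names are hypothetical, and the line format is inferred only from the two log lines above. It parses a pathological-event line and reports whether it is one of the expected operator-driven rollouts.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical names; values are copied from the two log lines above.
NAMESPACE = "openshift-cluster-csi-drivers"
OPERATOR = "deployment/vmware-vsphere-csi-driver-operator"

# Event reason -> the object the operator is expected to update.
ALLOWED_UPDATES = {
    "DaemonSetUpdated": "DaemonSet.apps/vmware-vsphere-csi-driver-node",
    "DeploymentUpdated": "Deployment.apps/vmware-vsphere-csi-driver-controller",
}

# Assumed line layout: namespace/<ns> <locator> hmsg/<hash> - reason/<reason> <message>
EVENT_RE = re.compile(
    r"namespace/(?P<namespace>\S+) "
    r"(?P<locator>\S+) "
    r"hmsg/\S+ - "
    r"reason/(?P<reason>\S+) "
    r"(?P<message>.*)"
)


@dataclass(frozen=True)
class Event:
    namespace: str
    locator: str
    reason: str
    message: str


def parse_event(line: str) -> Optional[Event]:
    """Extract the fields of interest, or return None if the line does not fit."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return Event(**match.groupdict())


def is_expected_rollout(event: Event) -> bool:
    """True only for the operator's own DaemonSet/Deployment update events."""
    if event.namespace != NAMESPACE or event.locator != OPERATOR:
        return False
    target = ALLOWED_UPDATES.get(event.reason)
    if target is None:
        return False
    return event.message.startswith(f"Updated {target} ")
```

The check is deliberately narrow: it requires the namespace, the operator locator, the reason, and the updated object to all match, so unrelated `DeploymentUpdated` events elsewhere in the cluster would still be reported.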

cc @openshift/storage

RomanBednar avatar Dec 01 '25 13:12 RomanBednar

Pipeline controller notification

This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

openshift-ci-robot avatar Dec 01 '25 13:12 openshift-ci-robot

/payload-job periodic-ci-openshift-release-master-nightly-4.22-e2e-vsphere-ovn-serial

RomanBednar avatar Dec 01 '25 13:12 RomanBednar

@RomanBednar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.22-e2e-vsphere-ovn-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1cd81ed0-ceba-11f0-9648-ccf8f4f61ee7-0

openshift-ci[bot] avatar Dec 01 '25 13:12 openshift-ci[bot]

Scheduling required tests:

  • /test e2e-aws-csi
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-microshift
  • /test e2e-aws-ovn-microshift-serial
  • /test e2e-aws-ovn-serial-1of2
  • /test e2e-aws-ovn-serial-2of2
  • /test e2e-gcp-csi
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upgrade
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-ovn-upi

openshift-ci-robot avatar Dec 01 '25 13:12 openshift-ci-robot

@RomanBednar: This pull request references Jira Issue OCPBUGS-65674, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Dec 01 '25 14:12 openshift-ci-robot

/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-vsphere-ovn-serial

RomanBednar avatar Dec 02 '25 09:12 RomanBednar

@RomanBednar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.21-e2e-vsphere-ovn-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/9f38f4a0-cf62-11f0-81c2-c8b0dd090295-0

openshift-ci[bot] avatar Dec 02 '25 09:12 openshift-ci[bot]

I can see it helped with the events.

/lgtm
/approve

jsafrane avatar Dec 02 '25 10:12 jsafrane

@RomanBednar: This PR has been marked as verified by payload-job.

In response to this:

/verified by payload-job

openshift-ci-robot avatar Dec 08 '25 09:12 openshift-ci-robot

/assign @xueqzhan

for approval

RomanBednar avatar Dec 08 '25 09:12 RomanBednar

/label acknowledge-critical-fixes-only

It fixes a TRT component readiness issue.

jsafrane avatar Dec 08 '25 12:12 jsafrane

/assign @bertinatto

Pinging more people with approval rights 👍

RomanBednar avatar Dec 10 '25 15:12 RomanBednar

ping @xueqzhan

RomanBednar avatar Dec 12 '25 08:12 RomanBednar

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bertinatto, jsafrane, RomanBednar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Dec 15 '25 16:12 openshift-ci[bot]

Scheduling required tests:

  • /test e2e-aws-csi
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-microshift
  • /test e2e-aws-ovn-microshift-serial
  • /test e2e-aws-ovn-serial-1of2
  • /test e2e-aws-ovn-serial-2of2
  • /test e2e-gcp-csi
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upgrade
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-ovn-upi

openshift-ci-robot avatar Dec 15 '25 16:12 openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 39f4424c7d86c5750341f54b43c1b964cf507295 and 2 for PR HEAD 47f5c994df75d2fa1cc3b9874b9c424c3eeb799b in total

openshift-ci-robot avatar Dec 15 '25 19:12 openshift-ci-robot

@RomanBednar: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci[bot] avatar Dec 15 '25 22:12 openshift-ci[bot]

@RomanBednar: Jira Issue Verification Checks: Jira Issue OCPBUGS-65674

  • ✔ This pull request was pre-merge verified.
  • ✔ All associated pull requests have merged.
  • ✔ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-65674 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. :clock4:

openshift-ci-robot avatar Dec 15 '25 22:12 openshift-ci-robot