NO-JIRA: watchpods: fix the collection logic for pending pods

Open rphillips opened this issue 1 year ago • 21 comments

periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le is exhibiting a flake in which pods are reported as transitioning to the Pending state. I reviewed the kubelet, scheduler, and test logs, and the behavior they show looks consistent.

The test itself, however, does not check whether the 'Old' pod was already in the Pending state before flagging an invalid state transition. This PR changes the test so that an Old Pod Phase of Pending followed by a New Pod Phase of Pending is not considered a bad transition.
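The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `PodPhase` type is declared locally so the example compiles without the Kubernetes API packages, the helper name `isBadPendingTransition` is hypothetical, and treating an empty old phase as "no previous observation" is an assumption.

```go
package main

import "fmt"

// PodPhase mirrors the phase strings used by Kubernetes pods. It is declared
// locally so this sketch compiles without the k8s.io/api dependency.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// isBadPendingTransition is a hypothetical helper. It reports whether an
// observed update moved a pod into Pending from some other phase.
// Pending -> Pending is an ordinary update to a pod that has not started
// yet, so it is not flagged.
func isBadPendingTransition(oldPhase, newPhase PodPhase) bool {
	if newPhase != PodPending {
		// Only transitions that end in Pending are of interest here.
		return false
	}
	if oldPhase == "" {
		// Assumption: an empty old phase means there was no earlier
		// observation (for example, pod creation), so nothing transitioned.
		return false
	}
	// The old phase must be inspected: only a move from a non-Pending
	// phase back to Pending counts as a bad transition.
	return oldPhase != PodPending
}

func main() {
	fmt.Println(isBadPendingTransition(PodRunning, PodPending)) // Running -> Pending
	fmt.Println(isBadPendingTransition(PodPending, PodPending)) // Pending -> Pending
	fmt.Println(isBadPendingTransition(PodPending, PodRunning)) // Pending -> Running
}
```

In this sketch only the first call in `main` is flagged; the other two are treated as acceptable updates.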

rphillips avatar Jan 24 '24 20:01 rphillips

/lgtm
/approve

deads2k avatar Jan 24 '24 20:01 deads2k

@rphillips: This pull request explicitly references no jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Jan 24 '24 20:01 openshift-ci-robot

/cherrypick release-4.15

deads2k avatar Jan 24 '24 20:01 deads2k

@deads2k: once the present PR merges, I will cherry-pick it on top of release-4.15 in a new PR and assign it to you.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: deads2k, rphillips

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Jan 24 '24 20:01 openshift-ci[bot]

/pj-rehearse help

harche avatar Jan 24 '24 20:01 harche

/payload help

harche avatar Jan 24 '24 20:01 harche

/payload-job periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

harche avatar Jan 24 '24 20:01 harche

@: trigger 1 job(s) for the /payload-(job|aggregate|job-with-prs) command

  • periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/68d9bd20-baf8-11ee-9816-189d052c08b4-0

openshift-ci[bot] avatar Jan 24 '24 20:01 openshift-ci[bot]

/hold until we get the results from https://github.com/openshift/origin/pull/28548#issuecomment-1908878339

harche avatar Jan 24 '24 20:01 harche

New changes are detected. LGTM label has been removed.

openshift-ci[bot] avatar Jan 25 '24 01:01 openshift-ci[bot]

/payload-job periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

rphillips avatar Jan 25 '24 01:01 rphillips

@: trigger 1 job(s) for the /payload-(job|aggregate|job-with-prs) command

  • periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/31e43110-bb20-11ee-9512-7494d5028ea1-0

openshift-ci[bot] avatar Jan 25 '24 01:01 openshift-ci[bot]

/retest

rphillips avatar Jan 25 '24 02:01 rphillips

/payload-job periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

rphillips avatar Jan 25 '24 03:01 rphillips

@: trigger 1 job(s) for the /payload-(job|aggregate|job-with-prs) command

  • periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f26b59d0-bb30-11ee-97b8-0749aba32382-0

openshift-ci[bot] avatar Jan 25 '24 03:01 openshift-ci[bot]

/payload-job periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

rphillips avatar Jan 25 '24 19:01 rphillips

@rphillips: trigger 1 job(s) for the /payload-(job|aggregate|job-with-prs) command

  • periodic-ci-openshift-multiarch-master-nightly-4.16-ocp-e2e-ovn-remote-libvirt-ppc64le

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/937953d0-bbb9-11ee-9e15-2c74cc70a484-0

openshift-ci[bot] avatar Jan 25 '24 19:01 openshift-ci[bot]

@rphillips: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-ovn-upgrade | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | true | /test e2e-gcp-ovn-upgrade |
| ci/prow/e2e-aws-ovn-single-node-serial | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | false | /test e2e-aws-ovn-single-node-serial |
| ci/prow/e2e-aws-ovn-upgrade | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | false | /test e2e-aws-ovn-upgrade |
| ci/prow/e2e-gcp-ovn | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-gcp-csi | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | false | /test e2e-gcp-csi |
| ci/prow/verify | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | true | /test verify |
| ci/prow/e2e-gcp-ovn-rt-upgrade | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-aws-ovn-single-node-upgrade | acfdaeb1ce54af5310f41018582b1e9d0f0619e8 | link | false | /test e2e-aws-ovn-single-node-upgrade |

openshift-ci[bot] avatar Jan 26 '24 00:01 openshift-ci[bot]

Job Failure Risk Analysis for sha: acfdaeb1ce54af5310f41018582b1e9d0f0619e8

| Job Name | Failure Risk |
| --- | --- |
| pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade | IncompleteTests |

Tests for this run (419) are below the historical average (2017): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

openshift-trt-bot avatar Mar 05 '24 16:03 openshift-trt-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot avatar Jun 04 '24 01:06 openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot avatar Jul 04 '24 08:07 openshift-bot