OCPBUGS-62929: Check router RBAC before external cert ops
Added a wait step so the router service account’s RBAC settles before we create or update routes that use external certificates. The new helper impersonates the router SA and polls for get/list/watch access on the referenced secret, which eliminates the Forbidden errors that were flaking CI when the admission webhook fired during RBAC propagation.
@bentito: This pull request references Jira Issue OCPBUGS-62929, which is invalid:
- expected the bug to target the "4.21.0" version, but no target version was set
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
The bug has been updated to refer to the pull request using the external bug tracker.
/jira refresh
@bentito: This pull request references Jira Issue OCPBUGS-62929, which is valid. The bug has been moved to the POST state.
3 validation(s) were run on this bug
- bug is open, matching expected state (open)
- bug target version (4.21.0) matches configured target version for branch (4.21.0)
- bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
Requesting review from QA contact: /cc @lihongan
Risk analysis has seen new tests most likely introduced by this PR. Please ensure that new tests meet guidelines for naming and stability.
New Test Risks for sha: e3f4c0f9cebe669bc734b3114cbd3bb41927f5b3
| Job Name | New Test Risk |
|---|---|
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-1of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] server supports sending resources in Table format [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-1of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by metadata client's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-1of2 | Medium - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod with container resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-1of2 | High - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod, 1 container with resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, was only seen in one job, and failed 1 time(s) against the current commit. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] reflector doesn't support receiving resources as Tables [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by client-go's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by dynamic client's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should be requested by informers when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | Medium - "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should be requested by metadatainformer when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, and was only seen in one job. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | High - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod with container resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, was only seen in one job, and failed 1 time(s) against the current commit. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | High - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod, 1 container with resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, was only seen in one job, and failed 1 time(s) against the current commit. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | High - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod, no container resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, was only seen in one job, and failed 1 time(s) against the current commit. |
| pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2 | High - "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod, no container resources [Suite:openshift/conformance/serial] [Suite:k8s]" is a new test, was only seen in one job, and failed 1 time(s) against the current commit. |
New tests seen in this PR at sha: e3f4c0f9cebe669bc734b3114cbd3bb41927f5b3
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] reflector doesn't support receiving resources as Tables [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] server supports sending resources in Table format [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by client-go's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by dynamic client's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should NOT be requested by metadata client's List method when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should be requested by informers when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-api-machinery] API Streaming (aka. WatchList) [FeatureGate:WatchList] [Beta] [Serial] should be requested by metadatainformer when WatchListClient is enabled [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod with container resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 0, Fail: 1, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod, 1 container with resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 0, Fail: 1, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Burstable QoS pod, no container resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 0, Fail: 1, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod with container resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod, 1 container with resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 0, Fail: 1, Flake: 0]
- "[sig-node] Pod Level Resources [Serial] [Feature:PodLevelResources] [FeatureGate:PodLevelResources] [Beta] Guaranteed QoS pod, no container resources [Suite:openshift/conformance/serial] [Suite:k8s]" [Total: 1, Pass: 0, Fail: 1, Flake: 0]
/retest
/retest
Do we need a similar wait for the tests that delete RBAC or secrets? I don't know whether those tests have been flaky, but it seems to me that we might have race conditions in those tests too.
I don't think so. Here's why:
- Secret deletions already flow through `checkRouteStatus`, which polls until the router reports `ExternalCertificateValidationFailed`, so we’re effectively waiting for the controller to observe the change (test/extended/router/external_certificate.go:239).
- RBAC deletions exercised in the “routes are not reachable” path also use that same status poll, so propagation is covered there (test/extended/router/external_certificate.go:293).
- The update scenarios that expect an API call to be rejected rely on the apiserver RBAC authorizer evaluating permissions synchronously at request time. Once the role binding is deleted, the admission stack should block the request immediately.
So I don't think an extra wait is needed at the moment. Note, though, that I didn't hunt for related flakes.
/retest
/retest
@Miciah: In the previous cycle there were 5 failing e2e jobs, none of them caused by the flake in question. Currently there is 1 failing e2e job, and it is not caused by this flake either. Can you take another review pass?
/assign @Miciah
/assign @rfredette
/retest
@bentito: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-metal-ipi-ovn-ipv6 | bf0e17cfb1503ef4adc30cb8adb420cae6a51fc0 | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/okd-scos-e2e-aws-ovn | bf0e17cfb1503ef4adc30cb8adb420cae6a51fc0 | link | false | /test okd-scos-e2e-aws-ovn |
Job Failure Risk Analysis for sha: bf0e17cfb1503ef4adc30cb8adb420cae6a51fc0
| Job Name | Failure Risk |
|---|---|
| pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6 | IncompleteTests Tests for this run (22) are below the historical average (2160): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems) |
Thanks!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bentito, Miciah
- ~~test/extended/router/OWNERS~~ [Miciah]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/jira refresh
The requirements for Jira bugs have changed (Jira issues linked to PRs on main branch need to target different OCP), recalculating validity.
@openshift-bot: This pull request references Jira Issue OCPBUGS-62929, which is invalid:
- expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "4.21.0" instead
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.