CNTRLPLANE-941: (monitor): ensure KAS doesn't excessively log unhandled informer errors
Excessive logging of unhandled informer errors is a signal that something may not be working correctly within the kube-apiserver.
We spotted this issue when tearing down the OpenShift OAuth stack during the rollout of an External OIDC-enabled cluster. See https://issues.redhat.com/browse/OCPBUGS-45460 for more details.
This led to the discovery that creation of RBAC resources could be blocked. A KAS admission plugin relied on an informer for an API served by the OpenShift OAuth API server, and because we now disable that server, the informer could no longer work.
This monitor test is meant to serve two purposes:
- Identify future occurrences of these unhandled errors, which seem harmless at first glance but can have a much deeper impact.
- Show that https://github.com/openshift/kubernetes/pull/2157 resolves the excessive logging of these unhandled errors.
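As a rough illustration of the kind of check described above, here is a minimal sketch that counts log lines resembling unhandled informer errors and flags the result when the count exceeds a threshold. It is not the monitor test added in this PR. The regular expression, the threshold value, the helper names (`countUnhandledInformerErrors`, `isExcessive`), and the sample log lines are all assumptions made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unhandledInformerError is an ASSUMED pattern for an unhandled informer
// (reflector) error line. The exact wording of real kube-apiserver log
// lines may differ; adjust the expression to match what is observed.
var unhandledInformerError = regexp.MustCompile(`Unhandled Error.*Failed to watch`)

// maxUnhandledErrors is a hypothetical threshold chosen for this sketch only.
const maxUnhandledErrors = 2

// countUnhandledInformerErrors returns the number of lines in logs that
// match the assumed unhandled informer error pattern.
func countUnhandledInformerErrors(logs string) int {
	count := 0
	for _, line := range strings.Split(logs, "\n") {
		if unhandledInformerError.MatchString(line) {
			count++
		}
	}
	return count
}

// isExcessive reports whether the number of matching lines is strictly
// greater than threshold.
func isExcessive(logs string, threshold int) bool {
	return countUnhandledInformerErrors(logs) > threshold
}

func main() {
	// Synthetic log lines for illustration; not copied from a real cluster.
	sample := strings.Join([]string{
		`"Unhandled Error" err="Failed to watch *v1.Example: resource not found"`,
		`"Some unrelated informational message"`,
		`"Unhandled Error" err="Failed to watch *v1.Example: resource not found"`,
		`"Unhandled Error" err="Failed to watch *v1.Example: resource not found"`,
	}, "\n")

	n := countUnhandledInformerErrors(sample)
	fmt.Printf("matched %d lines, excessive=%t\n", n, isExcessive(sample, maxUnhandledErrors))
}
```

A threshold rather than a zero-tolerance check is used here because a handful of such errors can plausibly occur during normal startup or rollout; the concern raised in this PR is the sustained, repeated logging.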
@everettraven: This pull request references CNTRLPLANE-941 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.
Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: everettraven. Once this PR has been reviewed and has the lgtm label, please assign neisw for approval. For more information see the Code Review Process.
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
@everettraven: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
- periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/444a31e0-9d6b-11f0-8baa-cd0f70e1eac4-0
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
@everettraven: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
- periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/57465640-9e13-11f0-8d1f-d595b4f3ff1c-0
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
@everettraven: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
- periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ad2116b0-9eed-11f0-86f9-2a5ad673167f-0
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure-techpreview
@everettraven: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure
@everettraven: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
- periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c12ec490-9fb5-11f0-8fdc-28fa7b05405e-0
/payload-job periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure
@everettraven: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
- periodic-ci-openshift-cluster-authentication-operator-release-4.21-periodics-e2e-gcp-external-oidc-configure
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/2ead5970-9fcd-11f0-92d6-247dbefae6ac-0
Risk analysis has seen new tests most likely introduced by this PR. Please ensure that new tests meet guidelines for naming and stability.
New tests seen in this PR at sha: 472116d4573ce977542dfd978c97d7a0c8fdbf48
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer cleanup" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer collection" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer interval construction" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer preparation" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer setup" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer test evaluation" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] monitor test kas-log-analyzer writing to storage" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
- "[Monitor:kas-log-analyzer][Jira:"kube-apiserver"] should not excessively log informer reflector unhandled errors" [Total: 2, Pass: 2, Fail: 0, Flake: 0]
@everettraven: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-aws-csi | 472116d4573ce977542dfd978c97d7a0c8fdbf48 | link | true | /test e2e-aws-csi |
| ci/prow/e2e-gcp-csi | 472116d4573ce977542dfd978c97d7a0c8fdbf48 | link | true | /test e2e-gcp-csi |
| ci/prow/go-verify-deps | 472116d4573ce977542dfd978c97d7a0c8fdbf48 | link | true | /test go-verify-deps |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 472116d4573ce977542dfd978c97d7a0c8fdbf48 | link | true | /test e2e-metal-ipi-ovn-ipv6 |