pull-kubernetes-e2e-gce: enable KUBE_WATCHLIST_INCONSISTENCY_DETECTOR
When this environmental variable is set then the reflector performs a data consistency check.
The consistency check is meant to be enforced only in the CI, not in production. The check ensures that data retrieved by the watch-list api call is exactly the same as data received by the standard list api call.
As of today the inconsistency detector runs with pull-kubernetes-e2e-gce-cos-alpha-features where it is used only by a single test.
As we are preparing to promote the WatchList feature to beta and enable it for KCM, this job will be used to check for any inconsistencies.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: p0lyn0mial Once this PR has been reviewed and has the lgtm label, please assign cheftako for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/assign @deads2k @wojtek-t
/test pull-test-infra-verify-lint
xref: https://github.com/kubernetes/kubernetes/pull/122659
@p0lyn0mial: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-test-infra-verify-lint | fb95c271f31addab016b3e6fd389f99b4f62a89b | link | true | /test pull-test-infra-verify-lint |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
do you have PR open demonstrating how this will be used? I think that will help reviewers understand the purpose better and how the failures will be manifested to developers if/when it fails.
Playing this forward, I think when set and the client-go featuregate is enabled this would cause the .List to run both a streaming list and a regular list and compare the results?
do you have PR open demonstrating how this will be used? I think that will help reviewers understand the purpose better and how the failures will be manifested to developers if/when it fails.
As of today when this env var is enabled and the WatchList feature is enabled for the server and the ENABLE_CLIENT_GO_WATCH_LIST_ALPHA is set for a client then a reflector/informer (on client) will run both a streaming list and a regular list and compare the results. If the results are different then the reflector/informer will panic and crash the entire binary (and the entire ci job). This is similar to ENABLE_CACHE_MUTATION_DETECTOR.
Note today the entire path is already executed in https://github.com/kubernetes/kubernetes/blob/f2cfbf44b1fb482671aedbfff820ae2af256a389/test/e2e/apimachinery/watchlist.go#L39 but only for e2e-gce-cos-alpha-features job.
Note2 merging this pr won't have any effect until the WatchList feature is enabled by default (both for the server and the client). My intention is to have this PR merged so that we can start collecting data when the feature is enabled.
Note3 we should add a similar detection mechanism for the .List method once we have watchlist support for the .List method.
Can you open a k/k PR with the change you plan to merge to use this env var?
Can you open a k/k PR with the change you plan to merge to use this env var?
It will be more than one PR, the first one is at https://github.com/kubernetes/kubernetes/pull/122659 Do you want to hold back this PR until we know that we want to promote the feature to beta?