kubernetes test: add feature gate consistency test for kubernetes feature

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR adds an integration test for testing feature gate compatibility for the Kubernetes Compatibility Versions KEP-4330 feature. This test parses the feature gate information from a test kube-apiserver w/ --emulation-version=<prior-version> and hits the /metrics endpoint. It compares the values it gets from this to the known correct values for that version. This test is needed as Compatibility Versions guarantees that APIs and feature gate values are correctly emulated properly when using --emulated-version=n-1..3. This test tests that core functionality of the feature.

NOTE: Currently this test fails "correctly" stating that AllowServiceLBStatusOnNonLB and ComponentSLIs are expected but missing. This test effectively caught those two feature gates being removed when they should not have been removed as they were usable in v1.31 and Compatibility Versions should allow them to be usable in 1.32, 1.33, and 1.34. This is documented more thoroughly in the issue here - https://github.com/kubernetes/kubernetes/issues/128036. As such, this PR is blocked on reverting the removal PRs for those feature gates- https://github.com/kubernetes/kubernetes/pull/126786 and https://github.com/kubernetes/kubernetes/pull/126698. Once those are reverted, this test will pass properly

Which issue(s) this PR fixes:

Fixes #127947

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-architecture/4330-compatibility-versions/README.md

Oct 16 '24 21:10 aaron-prindle

/cc @Jefftree

Oct 16 '24 21:10 aaron-prindle

/sig api-machinery

Oct 16 '24 21:10 aaron-prindle

/triage accepted

Oct 16 '24 21:10 aaron-prindle

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aaron-prindle Once this PR has been reviewed and has the lgtm label, please assign mikedanese for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

hack/OWNERS
test/integration/apiserver/OWNERS

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

Oct 18 '24 21:10 k8s-ci-robot

@aaron-prindle as you're looking at this code, I wonder if you'd be interested in helping produce a release artefact that lists the feature gates in a release and whether they are enabled / disabled by default.

If we had this, it would help k/website code verify that the machine readable feature gate metadata are correct - see https://github.com/kubernetes/website/tree/main/content/en/docs/reference/command-line-tools-reference/feature-gates.

Oct 19 '24 13:10 sftim

@aaron-prindle as you're looking at this code, I wonder if you'd be interested in helping produce a release artefact that lists the feature gates in a release and whether they are enabled / disabled by default.

If we had this, it would help k/website code verify that the machine readable feature gate metadata are correct - see https://github.com/kubernetes/website/tree/main/content/en/docs/reference/command-line-tools-reference/feature-gates.

@sftim - yes, I would be interested in doing this. I've created this issue - https://github.com/kubernetes/kubernetes/issues/128241 to track this workstream. Feel free to add any context/ideas in the issue there, thanks!

Oct 21 '24 18:10 aaron-prindle

Rebasing now that https://github.com/kubernetes/kubernetes/pull/128139 and https://github.com/kubernetes/kubernetes/pull/128317 have merged (related comment: https://github.com/kubernetes/kubernetes/issues/128036#issuecomment-2436522308)

Oct 24 '24 23:10 aaron-prindle

@jefftree - test is now failing with :

feature DRAControlPlaneController not found in /metrics output

Is this feature gate being removed ok w.r.t. compatibility versions? I believe this PR is related:

https://github.com/kubernetes/kubernetes/pull/128003

I can modify the test to have some allowlist if necessary

Oct 25 '24 04:10 aaron-prindle

@Jefftree - test is now failing with :
feature DRAControlPlaneController not found in /metrics output
Is this feature gate being removed ok w.r.t. compatibility versions? I believe this PR is related:

#128003

I can modify the test to have some allowlist if necessary

Yes, alpha features are permitted to be removed anytime. https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecation

Oct 28 '24 20:10 Jefftree

LGTM for integration test

Oct 31 '24 20:10 Jefftree

/cc @BenTheElder - would you be able to review the test case generation tooling in the PR here?

hack/generate-integration-compatibility-versions-test.sh
hack/tools/parse-feature-gates/parse-feature-gates.go

Oct 31 '24 21:10 aaron-prindle

Sorry I missed the previous pings!

I think this is a reasonable starting point, agree with previous review comments. cc also @aojea for visiblity.

Nov 06 '24 00:11 BenTheElder

/retest

Nov 06 '24 22:11 aaron-prindle

/retest

Nov 07 '24 18:11 aaron-prindle

@aaron-prindle: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-kubernetes-e2e-gce-cos-alpha-features	f22a6ab774c6891698a02bbbb6db1a86cdf862a1	link	false	`/test pull-kubernetes-e2e-gce-cos-alpha-features`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Nov 07 '24 20:11 k8s-ci-robot

Takled offline w/ @aojea, there is consensus that this test is likely better as a prow periodic job than an integration test as this way the test can be fully automated vs require manual steps each release -> increased project friction/toil. Going to close this for now and submit a similar test to test-infra that is fully automated

Nov 07 '24 21:11 aaron-prindle

Takled offline w/ @aojea, there is consensus that this test is likely better as a prow periodic job than an integration test as this way the test can be fully automated vs require manual steps each release -> increased project friction/toil. Going to close this for now and submit a similar test to test-infra that is fully automated

sounds good. Not having manual steps each release is definitely a big plus.

Nov 07 '24 21:11 Jefftree

code in tree can not depend on manual steps or external factors to progress

Nov 09 '24 08:11 aojea

code in tree can not depend on manual steps or external factors to progress

Yeah that's fair, but I do think we can write a version of this that just requires being on a cluster with the flag set, and that seems to be more or less true in the latest version of the shell script.

It might not be appropriate as one of the existing integration tests though, maybe a new test binary or something

Feb 05 '25 22:02 BenTheElder

test: add feature gate consistency test for kubernetes feature - compatibililty versions

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: