kubernetes icon indicating copy to clipboard operation
kubernetes copied to clipboard

Aggregated Discovery Endpoint

Open alexzielenski opened this issue 3 years ago • 13 comments

What type of PR is this?

/kind feature

What this PR does / why we need it:

Details in KEP: https://github.com/kubernetes/enhancements/issues/3352

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Adds endpoint at /discovery/v1 which unifies all discovery information known to the cluster in one document.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/cc @apelisse /cc @seans3 /cc @kevindelgado /cc @Jefftree

/sig api-machinery /triage accepted

alexzielenski avatar Jul 25 '22 19:07 alexzielenski

/priority important-soon

alexzielenski avatar Jul 27 '22 17:07 alexzielenski

Still needs a feature flag and tests but otherwise generally ready for review

alexzielenski avatar Jul 27 '22 18:07 alexzielenski

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

k8s-triage-robot avatar Jul 27 '22 21:07 k8s-triage-robot

I think we should refactor the API to reflect the lessons learned from our first discovery document. Details like separation of GVR from GVK and the handling of subresources should be first class concepts this time. Rather than try to determine that new API in the last two days before freeze, I think we should continue shaping the API with the plan of merging it early in 1.26 instead. Given that the PR was only opened three days ago, I think the timing on this was always very tight.

@lavalamp sound reasonable?

Some thoughts on the API.

  1. Rather than rely on client knowledge of defaults for fields like GVK, I'd like to see it always specified
  2. subresource should be nested under the resource and always list GVK
  3. need some means of describing endpoints like /logs, /exec, /attach
  4. I like the idea of shaping the list like other lists and having an objectmeta.name instead of just the bare name. I'm open to making a bytes to bytes comparison in proto of the options.
  5. it seems like we may want status on these to reflect staleness.

deads2k avatar Jul 28 '22 19:07 deads2k

I think we should refactor the API to reflect the lessons learned from our first discovery document. Details like separation of GVR from GVK and the handling of subresources should be first class concepts this time. Rather than try to determine that new API in the last two days before freeze, I think we should continue shaping the API with the plan of merging it early in 1.26 instead. Given that the PR was only opened three days ago, I think the timing on this was always very tight.

Daniel Smith sound reasonable?

Some thoughts on the API.

  1. Rather than rely on client knowledge of defaults for fields like GVK, I'd like to see it always specified
  2. subresource should be nested under the resource and always list GVK
  3. need some means of describing endpoints like /logs, /exec, /attach
  4. I like the idea of shaping the list like other lists and having an objectmeta.name instead of just the bare name. I'm open to making a bytes to bytes comparison in proto of the options.
  5. it seems like we may want status on these to reflect staleness.

I think we have to consider the implications of these proposed discovery api changes to the clients. Currently, every client is dependent on the DiscoveryInterface, which is expressed in meta/v1 structures (e.g. metav1.APIResource, metav1.APIGroup, etc.). And the proposed implementation of the client for this new endpoint merely converts the new DiscoveryAPI* structs back to the API* structs to ensure no users of the discovery client have to change.

https://github.com/kubernetes/kubernetes/pull/111468

If we expose an entire new discovery api that substantially differs from the current api, then every user of the DiscoveryInterface will have to change.

The initial impetus for this discovery KEP was to fix the DiscoveryStorm problem where clients were frequently downloading dozens of discovery documents regardless of document freshness. It sounds like exposing a new discovery api would radically expand the scope of what we are trying to accomplish here. Should we consider a more modest change without a new discovery api, so it does not cascade to making all clients change?

seans3 avatar Jul 28 '22 20:07 seans3

Rather than rely on client knowledge of defaults for fields like GVK, I'd like to see it always specified

agree

subresource should be nested under the resource and always list GVK

Unsure about the former, but adding GVK sounds good.

need some means of describing endpoints like /logs, /exec, /attach

I think those are already in the open api, and we can keep this discovery doc only about things that need restmapping?

I like the idea of shaping the list like other lists and having an objectmeta.name instead of just the bare name. I'm open to making a bytes to bytes comparison in proto of the options.

If name is all you want, that sounds fine. If you want e.g. RV simulation then we would need to talk about that more.

it seems like we may want status on these to reflect staleness.

hm, adding status and metadata clauses does seem like it will make JSON much bigger, at least. But I guess so would adding GVKs everywhere. Part of the reasoning to do this is that the document is small enough that it's reasonable to not split it up on into the foreseeable future.

I'm in favor of making easy & obvious improvements, but I don't think we should make them redo the whole feature -- this was more intended to be a bug fix. We can save complicated stuff for a v3?

lavalamp avatar Jul 28 '22 21:07 lavalamp

@deads2k

I talked with the team, the intent here was actually not to change the API at all, just do the absolute minimum to get everything into a single document and make it safe given that it's coming out of a cache now. @seans3 believes that it's easy to adjust/update clients if we avoid making semantic changes.

I still advise making a v2 for mechanical reasons (to reuse conversion generators etc), but not making semantic changes. Will that work for you? (I don't expect this to make the code freeze deadline either way) The team didn't intend to sign up for semantic API changes.

I think minimizing the diff for clients is somewhat important, they have to change anyway, but it could be easier or harder to adapt depending on how different we make this document.

lavalamp avatar Jul 28 '22 23:07 lavalamp

Actually, it looks like we can re-use the metav1.APIResource type directly. That also removes any of the conversion code that currently existed.

apelisse avatar Jul 29 '22 00:07 apelisse

I talked with the team, the intent here was actually not to change the API at all,

This came up in the KEP review here: https://github.com/kubernetes/enhancements/pull/3364#discussion_r897282810 with a specific comment expanded here for convenience:

deads2k on Jun 14 I think we should make our API serialization the type we want. I can understand if internally you have a RESTMapper convert to something else to make code easier to migrate.

Once a new API is released, it will be pinned. Creating a new API with the known flaws of the previous API is a mistake. The consuming code has to adapt either way.

deads2k avatar Jul 29 '22 18:07 deads2k

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alexzielenski Once this PR has been reviewed and has the lgtm label, please assign lavalamp for approval by writing /assign @lavalamp in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Aug 01 '22 15:08 k8s-ci-robot

Once a new API is released, it will be pinned. Creating a new API with the known flaws of the previous API is a mistake. The consuming code has to adapt either way.

An example of the modifications that would provide an API that solves these items: https://github.com/kubernetes/kubernetes/pull/111409#issuecomment-1198536637

https://github.com/kubernetes/kubernetes/pull/111792

deads2k avatar Aug 10 '22 21:08 deads2k

@alexzielenski: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-conformance-kind-ipv6-parallel fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link false /test pull-kubernetes-conformance-kind-ipv6-parallel
pull-kubernetes-e2e-gce-ubuntu-containerd fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-e2e-gce-ubuntu-containerd
pull-kubernetes-integration fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-integration
pull-kubernetes-e2e-gce-alpha-features fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link false /test pull-kubernetes-e2e-gce-alpha-features
pull-kubernetes-e2e-kind fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-e2e-kind
pull-kubernetes-unit fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-unit
pull-kubernetes-node-e2e-containerd fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-node-e2e-containerd
pull-kubernetes-e2e-gce-100-performance fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-e2e-gce-100-performance
pull-kubernetes-verify-govet-levee fe0badc7633ec89f7cd4392eca3db7ea0dcbc6aa link true /test pull-kubernetes-verify-govet-levee
pull-kubernetes-dependencies 8443b89e45b21b60108838ad6b42474cb7a16faa link true /test pull-kubernetes-dependencies

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Aug 29 '22 23:08 k8s-ci-robot

closing this in favor of splitting into smaller PR chunks

alexzielenski avatar Aug 29 '22 23:08 alexzielenski