controller-tools icon indicating copy to clipboard operation
controller-tools copied to clipboard

Generated OpenAPI schema for metav1.LabelSelector accepts invalid input

Open MikeSpreitzer opened this issue 10 months ago • 7 comments

When I use controller-gen to generate a CustomResourceDefinition from my commented Go definition of an object type that uses metav1.LabelSelector, the OpenAPI schema for the LabelSelector only says that the label names and values must be strings. But the strings that are ultimately allowed in those positions are rather restricted (see https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set).

MikeSpreitzer avatar Apr 12 '24 04:04 MikeSpreitzer

While I agree with your assessment and that this could be better, I think this is something that, as a community, we probably don't want to fix yet.

Making a change to the way that the schema is generated now would create a breaking change in any consumers API that uses the label selector today and has already shipped.

Until we can fix this, you should be able to use a CEL validation for any label selector you want to add as a new field and then validate the fields within it using a regex (or combination of regexes) that enforces the correct format.

In the future, we may be able to fix this at the controller tools layer. Kube is introducing ratcheting validation and, once that is present and stable, we may be able to find a way to make this a backwards compatible change.

This will need to be done carefully though so that all supported versions of Kube at the time support the ratcheting validation, else we will still run the risk of users generating breaking changes in their APIs.

I think sticking a pin in this for a couple of years might be the best course of action for now.

JoelSpeed avatar Apr 12 '24 05:04 JoelSpeed

Agree!

This will need to be done carefully though so that all supported versions of Kube at the time support the ratcheting validation, else we will still run the risk of users generating breaking changes in their APIs.

Wondering if this is enough to be honest. We know that a lot of people are running unsupported versions and I don't know if we want to break them via controller-gen. So maybe we want to provide a flag to opt-out then?

sbueringer avatar Apr 12 '24 05:04 sbueringer

Perhaps a "minimum compatible version" flag of some description that triggers features/generation changes like this.

It would be up to a project to decide their minimum level and generate their CRDs against that version.

That said, do we not declare a compatibility matrix between controller tools versions and the KAS that the CRD is to be installed on? How did we handle the v1beta1 to v1 API extensions group change for example?

JoelSpeed avatar Apr 12 '24 10:04 JoelSpeed

I do not understand the concern. The restrictions on label names and values have been in place for a long time. I do not understand how any shipped products can be working with invalid labels.

MikeSpreitzer avatar Apr 12 '24 15:04 MikeSpreitzer

Restricting validation at admission time is a breaking change no matter what happens on the backend.

While yes, your use case of a label selector means that parsing it results in an error, you could in theory parse it in another way that means that the valid values are different (see upstream API guidelines about not re-using other people's types).

You also have to bear in mind that changing the validation will immediately break all writes to a currently invalid object. Which means that not only will users not be able to update the object unless they fix the error, but even status updates from controllers will fail. Outwardly, this can have very bad consequences such as "Oh the object actually looks good" when in fact it is very not good.

When ratcheting validation comes in, that latter problem is resolved. Only writes to the broken field will fail, writes to the rest of the object would succeed. This means that controllers can continue to operate, even if the spec is invalidated because of an updated validation

JoelSpeed avatar Apr 12 '24 16:04 JoelSpeed

I see, thanks for the explanation.

It seems backwards to hurt developers who are trying to use LabelSelectors exactly as they are designed and intended, for the sake of old stuff that abuses them. Could there be a way to tell controller-gen "hey, I am using labels and label selectors in the normal way and want full validation"?

MikeSpreitzer avatar Apr 12 '24 19:04 MikeSpreitzer

I think it comes down to doing the right thing per default and allowing a way to opt-out with a flag or something.

sbueringer avatar Apr 15 '24 06:04 sbueringer

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jul 14 '24 06:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 13 '24 07:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Sep 12 '24 07:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Sep 12 '24 07:09 k8s-ci-robot