test-infra icon indicating copy to clipboard operation
test-infra copied to clipboard

ProwJob validation fails when trying to apply `tekton_pipeline_run_spec` with inline parameters

Open Ressetkk opened this issue 2 years ago • 14 comments

/area prow /sig testing

When trying to apply ProwJob generated by mkpj, the server validation refuses to accept the resource with the following error message:

The ProwJob "ddaafcf6-a7b0-11ed-a4c2-e6e90e87bdcb" is invalid: spec.tekton_pipeline_run_spec.v1beta1.params[0].value: Invalid value: "string": spec.tekton_pipeline_run_spec.v1beta1.params[0].value in body must be of type object: "string"

This is the inline definition of tekton_pipeline_run_spec, as followed and supported by Prow after merging support for v1beta1 API. I made a comparison between API definitions in v0.36.0 and v0.44.0 and there haven't been any changes in the API between releases. That means ultimately Prow's validation has some problems here.

apiVersion: prow.k8s.io/v1
kind: ProwJob
metadata:
  annotations:
    prow.k8s.io/context: ""
    prow.k8s.io/job: ci-tekton-run
  creationTimestamp: null
  labels:
    created-by-prow: "true"
    prow.k8s.io/context: ""
    prow.k8s.io/job: ci-tekton-run
    prow.k8s.io/type: periodic
  name: 10d2a87a-a7b3-11ed-ae2f-e6e90e87bdcb
spec:
  agent: tekton-pipeline
  cluster: kind-kind
  job: ci-tekton-run
  namespace: default
  prowjob_defaults:
    tenant_id: GlobalDefaultID
  report: true
  tekton_pipeline_run_spec:
    v1beta1:
      params:
      - name: variable
        value: Pipelinezz
      pipelineSpec:
        description: Hello world!
        tasks:
        - name: hello
          taskSpec:
            metadata: {}
            spec: null
            steps:
            - image: alpine:edge
              name: echo
              resources: {}
              script: |
                #!/bin/sh
                echo "Hello World"
        - name: bye
          taskSpec:
            metadata: {}
            spec: null
            steps:
            - image: alpine:edge
              name: echo
              resources: {}
              script: |-
                #!/bin/sh
                echo "Bye! $(params.variable)"
  type: periodic
status:
  startTime: "2023-02-08T13:18:24Z"
  state: triggered

In usual PipelineRuns CR the params.value can be either string, or object.

Looking at the definition of PipelineRun in Tekton I noticed something interesting. The way how Tekton defines the behavior of managing CRD is by setting up x-kubernetes-preserve-unknown-fields: true in openAPIV3schema. Going back to Prow's CRD I see that the pipelineRun definition is explicitly defined for this spec. That means Kubernetes rejects the request to create such ProwJob, because params indeed doesn't follow Tekton's way of managing CRDs.

Another thing to mention, when I define the value parameter as object it gets validated properly, but the result of said PipelineRun is not handled by Tekton itself, as it tries to parse object type as string.

        value:
          type: string
          stringVal: Pipelinezz
          arrayVal: []
          objectVal: {}

Tested infrastructure

  • GKE 1.25
  • Prow latest version
  • Tekton Pipelines v0.37.2 and v0.44.0 (Current)

Can we do something about it? Would it be possible to handle Tekton's base the same way in Prow as it's handled natively?

Ressetkk avatar Feb 08 '23 14:02 Ressetkk

I did some investigation. I manually removed entire tekton_pipeline_run_spec.v1beta1 properties and added x-kubernetes-preserve-unknown-fields: true. ProwJob got applied and pipeline worked as expected.

The question is, can we somehow tell the generator to use this header, than validating this API?

Ressetkk avatar Feb 09 '23 10:02 Ressetkk

What's difference between inline and the prow manifest 🤔 I usually write my jobs like this and it works as expected with mkpj as well

presubmits:
  OWNER/REPO:
  - name: hello-goodbye
    agent: tekton-pipeline # Use the Prow Tekton controller!
    clone_uri: "[email protected]:OWNER/REPO.git"
    always_run: true
    skip_report: false
    rerun_command: /test hello-goodbye
    trigger: "(?m)^/test (all|hello-goodbye),?(\\s+|$)"
    pipeline_run_spec:
      serviceAccountName: prow-pipeline
      pipelineRef:
        name: hello-goodbye
      resources:
      - name: source
        resourceRef:
          name: PROW_IMPLICIT_GIT_REF
      workspaces:
      - name: workspace
        emptydir: {}

qaifshaikh avatar Feb 09 '23 22:02 qaifshaikh

@qaifshaikh basically, inline spec is a Pipeline definition that you can embed inside a PipelineRun resource under pipelineSpec field. The option that you provided will work well, but if you wanted to define your custom params value, just like in the PipelineRun resource, you would get a validation error mentioned before.

This should return validation error.

presubmits:
  OWNER/REPO:
  - name: hello-goodbye
    agent: tekton-pipeline # Use the Prow Tekton controller!
    clone_uri: "[email protected]:OWNER/REPO.git"
    always_run: true
    skip_report: false
    rerun_command: /test hello-goodbye
    trigger: "(?m)^/test (all|hello-goodbye),?(\\s+|$)"
    pipeline_run_spec:
      serviceAccountName: prow-pipeline
      params:
        - name: some-name
          value: "Some value"
      pipelineRef:
        name: hello-goodbye
      resources:
      - name: source
        resourceRef:
          name: PROW_IMPLICIT_GIT_REF
      workspaces:
      - name: workspace
        emptydir: {}

Ressetkk avatar Feb 12 '23 01:02 Ressetkk

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 13 '23 01:05 k8s-triage-robot

/remove-lifecycle stale It's still valid

Ressetkk avatar May 15 '23 07:05 Ressetkk

@Ressetkk

I did some investigation. I manually removed entire tekton_pipeline_run_spec.v1beta1 properties and added x-kubernetes-preserve-unknown-fields: true. ProwJob got applied and pipeline worked as expected.

Are you using such workaround? I found if I deleted the params properties, ProwJob did not have any params that I specified. I'm not sure how to pass params correctly with such workaround.

ileixe avatar Oct 23 '23 04:10 ileixe

@Ressetkk

I did some investigation. I manually removed entire tekton_pipeline_run_spec.v1beta1 properties and added x-kubernetes-preserve-unknown-fields: true. ProwJob got applied and pipeline worked as expected.

Are you using such workaround? I found if I deleted the params properties, ProwJob did not have any params that I specified. I'm not sure how to pass params correctly with such workaround.

Right now we stopped using tekton for now, but yes, it was working as expected. Tekton does the validation on the controller level and by default the CR retains any keys without validation on CRD level.

Ressetkk avatar Oct 25 '23 07:10 Ressetkk

@Ressetkk

I did some investigation. I manually removed entire tekton_pipeline_run_spec.v1beta1 properties and added x-kubernetes-preserve-unknown-fields: true. ProwJob got applied and pipeline worked as expected.

Are you using such workaround? I found if I deleted the params properties, ProwJob did not have any params that I specified. I'm not sure how to pass params correctly with such workaround.

Right now we stopped using tekton for now, but yes, it was working as expected. Tekton does the validation on the controller level and by default the CR retains any keys without validation on CRD level.

Thanks, it worked well after I deleted entire fields below v1beta1 (I deleted params under v1beta.properties only as it was the error I found).

Still I can't get why the CRD validation emits error while the data itself is valid following schema though, now we can use tekton-pipeline at least. Thanks for the hint.

ileixe avatar Oct 25 '23 10:10 ileixe

Left the working CRD for later: https://github.com/kyma-project/test-infra/blob/main/prow/cluster/components/prowjob_customresourcedefinition.yaml

ileixe avatar Oct 25 '23 10:10 ileixe

I believe it's also a problem on how Prow's CRD is generated. Instead of injecting entire tekton pipeline CRD this one line of code, and let tekton validate the CR, accepting it as-is in prow.

cc @petr-muller as it's also partially related to the excessive size of a CRD.

Ressetkk avatar Oct 25 '23 11:10 Ressetkk

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 31 '24 01:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Mar 01 '24 02:03 k8s-triage-robot

/remove-lifecycle rotten @Ressetkk I guess this is still valid, right? Would it make sense to make x-kubernetes-preserve-unknown-fields: true as a permanent fix?

jmguzik avatar Mar 01 '24 08:03 jmguzik

Yes, I believe so. I'm not sure if it's possible to add it instead of injecting entire tekton cr structure though... @jmguzik

Ressetkk avatar Mar 14 '24 00:03 Ressetkk