pipeline icon indicating copy to clipboard operation
pipeline copied to clipboard

Pipeline failing with "invalid pipelineresults" when referencing results of skipped tasks

Open thomascube opened this issue 2 years ago • 13 comments

Expected Behavior

According to the documentation pipeline results can contain references to task results. If a task did not run because of it's whenConditions, the pipeline result is not emitted but the pipeline succeeds. This worked with v0.37.5 of Tekton Pipelines available on OpenShift.

Actual Behavior

With the update of the RedHat OpenShift Pipelines Operator, which brings Tekton Pipeline v0.41.0, pipelines with skipped tasks and composed results now fail with the message invalid pipelineresults [result-goodbye], the referred results don't exist.

Steps to Reproduce the Problem

Run the following PipelineRun:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipelinerun-test
spec:
  pipelineSpec:
    params:
      - name: say-hello
        default: 'false'
    tasks:
      - name: hello
        taskSpec:
          results:
            - name: result-one
          steps:
          - image: alpine
            script: |
              #!/bin/sh
              echo "Hello world!"
              echo -n "RES1" > $(results.result-one.path)
  
      - name: goodbye
        runAfter:
          - hello
        taskSpec:
          results:
            - name: result-two
          steps:
          - image: alpine
            script: |
              #!/bin/sh
              echo "Goodbye world!"
              echo -n "RES2" > $(results.result-two.path)
        when:
          - input: $(params.say-hello)
            operator: in
            values: ["true"]

    results:
      - name: result-hello
        description: Result one
        value: '$(tasks.hello.results.result-one)'
      - name: result-goodbye
        description: Result two
        value: '$(tasks.goodbye.results.result-two)'

Additional Info

  • Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4+a34b9e9", GitCommit:"b6d1f054747e9886f61dd85316deac3415e2726f", GitTreeState:"clean", BuildDate:"2023-01-10T15:55:28Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
  • Tekton Pipeline version:
Pipeline version: v0.41.0
Triggers version: v0.22.0
Openshift Pipelines Version: 1.9.0

thomascube avatar Feb 09 '23 16:02 thomascube

Success on v0.37.5: pipeline-success-v0 37 5

Failure on v0.41.0: pipeline-failure-v0 41 0

thomascube avatar Feb 09 '23 17:02 thomascube

Hi @thomascube, thanks! I think the validation is added intentionally, do you think it makes more sense to remove the check? Since I may lack the knowledge of the use cases, I'd like to learn more about your opinons.

Yongxuanzhang avatar Feb 09 '23 18:02 Yongxuanzhang

@Yongxuanzhang We would like to display these values in our UI tool (if they are available). Is there any way to default to empty, if the task was not run? e.g. something like '$(tasks.hello.results.result-one || '')' or '$(tasks.hello && tasks.hello.results.result-one)'?

kyubisation avatar Feb 09 '23 19:02 kyubisation

@kyubisation we don't have default result yet but it's something that @vinamra28 has been looking into -- considering a syntax similar to the first one you suggested: '$(tasks.hello.results.result-one || '')'

jerop avatar Feb 09 '23 19:02 jerop

@Yongxuanzhang We would like to display these values in our UI tool (if they are available). Is there any way to default to empty, if the task was not run? e.g. something like '$(tasks.hello.results.result-one || '')' or '$(tasks.hello && tasks.hello.results.result-one)'?

I think default results would be a solution. But it may take much longer time?

I wonder before this update, if the results are not available, there will be nothing showing in UI? The reason I add the check is to help debugging. So you would know which results are not emitted. I'm thinking if there is a better way to meet both needs? e.g. We don't fail the pipelinerun but only log the warning message?

Or actually we don't want to have any check/validation on this? If so I think we could just remove the validation.

@kyubisation @jerop any suggestions?

Yongxuanzhang avatar Feb 09 '23 20:02 Yongxuanzhang

@Yongxuanzhang I think the old behavior which is also described in the documentation under "A Pipeline Result is not emitted if any of the following are true" makes perfectly sense and we were running fine with it. With the new check and the failure of the pipeline it's impossible to use task results in conjunction with when conditions.

thomascube avatar Feb 10 '23 08:02 thomascube

@Yongxuanzhang Having validation is reasonable, but directly breaking the pipeline without a workaround/migration possibility is not ideal. I'd prefer to have this either be opt-in or have to have a fallback/default available at the same time.

kyubisation avatar Feb 10 '23 09:02 kyubisation

@Yongxuanzhang I think the old behavior which is also described in the documentation under "A Pipeline Result is not emitted if any of the following are true" makes perfectly sense and we were running fine with it. With the new check and the failure of the pipeline it's impossible to use task results in conjunction with when conditions.

In the doc you linked it says:

A PipelineTask referenced by the Pipeline Result didn’t emit the referenced Task Result. This should be considered a bug in the Task and may fail a PipelineTask in future.

It looks like we want to fail the run if the results are not emitted but referenced? You case is like the results may or may not emit and it depends on the task's when condition.

I'm ok with @kyubisation's suggestion.

Yongxuanzhang avatar Feb 10 '23 18:02 Yongxuanzhang

@Yongxuanzhang apparently I read the following statement differently

This should be considered a bug in the Task and may fail a PipelineTask in the future.

For me this sounds like a case where a task specifies a result but no write to that result file was made in any of the steps. I didn't make the connection to skipped tasks.

Anyway, you asked about my use-case: we have pipelines for building software projects such as a Java Maven application. The pipelines includes tasks for build the jar, testing it, running quality checks (sonar scan) and creating a Docker image from it. Some tasks are optional (e.g. sonar scan or docker build) and can be enabled/disabled by pipeline parameters consumed by when expressions guarding the task execution. These tasks also emit results (if executed) which are mapped into pipeline results. If for example a docker image was built, the image name and tag are visible in the pipeline results and vice versa: if the docker build was disabled by parameter, no such pipeline result exists.

Or a bit shorter: we run pipelines with optional tasks that yield optional results.

I was just surprised to suddenly see pipeline fail although all tasks have executed successfully. For possible solutions I can just copy @kyubisation's suggestions.

thomascube avatar Feb 12 '23 21:02 thomascube

Thanks for your input and suggestions! @thomascube @kyubisation
I have a better understanding of the use cases. I will fix it soon! Will keep you updated. :D

Yongxuanzhang avatar Feb 13 '23 18:02 Yongxuanzhang

/assign I believe https://github.com/tektoncd/community/pull/954 is a long term solution. But we should fix this to unblock the use cases

Yongxuanzhang avatar Feb 13 '23 18:02 Yongxuanzhang

I have opened this PR to fix it. @kyubisation @thomascube. Plz take a look 🙏 and let me know if this works for you?

Yongxuanzhang avatar Feb 13 '23 19:02 Yongxuanzhang

we run pipelines with optional tasks that yield optional results

@thomascube what do you mean by optional tasks? is it that they are guarded using when expressions?

don't think results were ever meant to be optional, we were planning to enforce it (https://github.com/tektoncd/pipeline/issues/3497) but looks like users have built around the current behavior 🤔 -- cc @pritidesai @vinamra28

jerop avatar Feb 13 '23 21:02 jerop

we run pipelines with optional tasks that yield optional results

@thomascube what do you mean by optional tasks? is it that they are guarded using when expressions?

don't think results were ever meant to be optional, we were planning to enforce it (#3497) but looks like users have built around the current behavior 🤔 -- cc @pritidesai @vinamra28

@jerop Do you think this worth discussing in a wg or some other way? I think this is about how we handle skipped tasks results. Options:

  1. fail the pipelinerun
  2. skip the validation and log the messages

either way I think we will still need https://github.com/tektoncd/community/pull/954

Yongxuanzhang avatar Feb 14 '23 19:02 Yongxuanzhang

@Yongxuanzhang just want to confirm, it seems like this regression was introduced in https://github.com/tektoncd/pipeline/pull/5088, i.e. it was not the result of an intentional change to add more validation to pipeline results?

lbernick avatar Mar 15 '23 23:03 lbernick

Hi @kyubisation @thomascube, the fix PR is merged, so the release v0.46 should have this fix. And I will cherry pick this PR into previous releases. I think you're using v0.41.0 ? Maybe we could reopen this issue or I create a new one to track it?

Yongxuanzhang avatar Mar 16 '23 17:03 Yongxuanzhang

@Yongxuanzhang Thanks for the work and the update. We are currently using 0.37.x and would like to upgrade to 0.41.x, so a backport would be very appreciated. If the backport is planned/scheduled soonish, I don't need a separate issue to track it. If not, then reopening this issue should be enough.

kyubisation avatar Mar 16 '23 18:03 kyubisation

@Yongxuanzhang Thanks for the work and the update. We are currently using 0.37.x and would like to upgrade to 0.41.x, so a backport would be very appreciated. If the backport is planned/scheduled soonish, I don't need a separate issue to track it. If not, then reopening this issue should be enough.

Thanks! I will post the updates in this issue!

Yongxuanzhang avatar Mar 16 '23 18:03 Yongxuanzhang

Hello. We just upgraded to Tekton v0.47.5 (via OpenShift Pipelines) and now this issue is back!

Our pipeline has defined a result which copies the value of a task result:

  results:
    - description: The URL to the generated junit test report
      name: REPORT_URL
      value: $(finally.generate-report.results.REPORT_URL)

Now if the task generate-report is skipped due to a when condition, the pipeline fails with the following error:

invalid pipelineresults [REPORT_URL], the referred results don't exist

With Tekton v0.44.2 the same pipeline terminated without an error.

thomascube avatar Nov 16 '23 13:11 thomascube

Hello. We just upgraded to Tekton v0.47.5 (via OpenShift Pipelines) and now this issue is back!

Our pipeline has defined a result which copies the value of a task result:

  results:
    - description: The URL to the generated junit test report
      name: REPORT_URL
      value: $(finally.generate-report.results.REPORT_URL)

Now if the task generate-report is skipped due to a when condition, the pipeline fails with the following error:

invalid pipelineresults [REPORT_URL], the referred results don't exist

With Tekton v0.44.2 the same pipeline terminated without an error.

So there's are no issues with v0.47.4, just trying to know which release introduce the new change

Yongxuanzhang avatar Nov 16 '23 15:11 Yongxuanzhang

@Yongxuanzhang we upgraded from v0.44.2 to v0.47.5 (this is what the Openshift Pipelines operator has bundled in its releases). Thus I don’t know in which version in between the old behavior came back.

thomascube avatar Nov 16 '23 22:11 thomascube

@Yongxuanzhang we upgraded from v0.44.2 to v0.47.5 (this is what the Openshift Pipelines operator has bundled in its releases). Thus I don’t know in which version in between the old behavior came back.

Ok I will take a look

Yongxuanzhang avatar Nov 16 '23 23:11 Yongxuanzhang

To be more precise, the issue only occurs when results of a skipped task from the finally section are involved. Regular tasks which are skipped to not trigger the said validation error but the following test pipeline does:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: pipelinerun-test-
spec:
  pipelineSpec:
    params:
      - name: say-goobye
        default: 'false'
      - name: do-shutdown
        default: 'false'
    tasks:
      - name: hello
        taskSpec:
          results:
            - name: result-one
          steps:
          - image: alpine
            script: |
              #!/bin/sh
              echo "Hello world!"
              echo -n "RES1" > $(results.result-one.path)
  
      - name: goodbye
        runAfter:
          - hello
        taskSpec:
          results:
            - name: result-two
          steps:
          - image: alpine
            script: |
              #!/bin/sh
              echo "Goodbye world!"
              echo -n "RES2" > $(results.result-two.path)
        when:
          - input: $(params.say-goobye)
            operator: in
            values: ["true"]

    finally:
      - name: shutdown
        taskSpec:
          results:
            - name: result-three
          steps:
          - image: alpine
            script: |
              #!/bin/sh
              echo "Shutdown world!"
              echo -n "RES3" > $(results.result-three.path)
        when:
          - input: $(params.do-shutdown)
            operator: in
            values: ["true"]

    results:
      - name: result-hello
        description: Result one
        value: '$(tasks.hello.results.result-one)'
      - name: result-goodbye
        description: Result two
        value: '$(tasks.goodbye.results.result-two)'
      - name: result-shutdown
        description: Result shutdown
        value: '$(finally.shutdown.results.result-three)'

I'll now test through the versions between v0.44.2 and v0.47.5 to figure out the exaction version the behavior changed.

thomascube avatar Nov 21 '23 08:11 thomascube

All right, I now tested through the releases and the given example pipeline still succeeds in v0.47.2 but starts failing with v0.47.3. I hope this helps you tracking down the change. It still fails with v0.50.2 though.

thomascube avatar Nov 21 '23 09:11 thomascube

Thanks! Looking into it

Yongxuanzhang avatar Nov 21 '23 21:11 Yongxuanzhang

I suspect the skipped finally tasks are not taken into consideration in the previous fix.

Yongxuanzhang avatar Nov 21 '23 21:11 Yongxuanzhang

I suspect the skipped finally tasks are not taken into consideration in the previous fix.

But it worked in versions until v0.47.2...

thomascube avatar Nov 22 '23 08:11 thomascube

I suspect the skipped finally tasks are not taken into consideration in the previous fix.

But it worked in versions until v0.47.2...

There are 2 examples, 1 is your original example without final task, and the other with final task. The one without final task has no issues running on latest main. And I think the final task is not working since the validation was introduced

Yongxuanzhang avatar Nov 23 '23 00:11 Yongxuanzhang

@thomascube I will create a fix for this, which Tekton Pipeline Version do we need this fix? I wonder when we have the fix, which major versions we need to patch on to unblock you. We should support the LTS which are: 44, 47, 50, 53. Is that ok?

Yongxuanzhang avatar Nov 23 '23 00:11 Yongxuanzhang

@thomascube I will create a fix for this, which Tekton Pipeline Version do we need this fix? I wonder when we have the fix, which major versions we need to patch on to unblock you. We should support the LTS which are: 44, 47, 50, 53. Is that ok?

That's perfect. We consume Tekton via OpenShift Pipelines (>= 1.11) where v0.47.x and v0.50.x versions are deployed: https://docs.openshift.com/pipelines/1.12/about/op-release-notes.html#compatibility-support-matrix_op-release-notes

thomascube avatar Nov 23 '23 15:11 thomascube