Pipelines never get triggered
After recent updates to Tekton (pipelines: v1.0), some of my pipelines (created through PAC) fail to start. The PipelineRuns get created, but their status stays at PipelineRunPending and the stuck pipelines never progress.
Kubernetes reports this error:
admission webhook "validation.webhook.pipeline.tekton.dev" denied the request: validation failed: invalid value: Once the PipelineRun has started, only status updates are allowed: spec
I am not sure why some pipelines have this issue and others don't, but as of now the only way for me to proceed is to delete the Tekton webhooks validation.webhook.pipeline.tekton.dev and webhook.pipeline.tekton.dev.
Here is an example of my pipelinerun (inside the .tekton/ directory):
---
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  name: pr
  labels:
    pipeline-run-name: my-pr
  annotations:
    karpenter.sh/do-not-disrupt: "true"
    # The event we are targeting as seen from the webhook payload
    # this can be an array too, i.e: [pull_request, push]
    pipelinesascode.tekton.dev/on-event: "[pull_request]"
    # The branch or tag we are targeting (ie: main, refs/tags/*)
    pipelinesascode.tekton.dev/on-target-branch: "[main, release/*]"
    # Fetch the git-clone task from hub, we are able to reference later on it
    # with taskRef and it will automatically be embedded into our pipeline.
    pipelinesascode.tekton.dev/task: ".tekton/tasks/custom-vars.yaml" # TODO: Remove once CI is merged
    pipelinesascode.tekton.dev/task-1: ".tekton/tasks/git-clone.yaml"
    pipelinesascode.tekton.dev/task-2: ".tekton/tasks/read-config.yaml"
    pipelinesascode.tekton.dev/task-3: ".tekton/tasks/helm-publish.yaml"
    pipelinesascode.tekton.dev/task-4: "kubernetes-actions"
    pipelinesascode.tekton.dev/task-5: ".tekton/tasks/build-web-assets.yaml"
    pipelinesascode.tekton.dev/task-6: ".tekton/tasks/start-world.yaml"
    pipelinesascode.tekton.dev/task-7: ".tekton/tasks/gh-comment-image-scanner.yaml"
    pipelinesascode.tekton.dev/task-8: ".tekton/tasks/sonarqube-scanner.yaml"
    pipelinesascode.tekton.dev/task-9: ".tekton/tasks/unit-tests.yaml"
    pipelinesascode.tekton.dev/task-10: "github-set-status"
    pipelinesascode.tekton.dev/task-11: ".tekton/tasks/main-deploy.yaml"
    # How many runs we want to keep.
    pipelinesascode.tekton.dev/max-keep-runs: "5"
spec:
  params:
    # The variable with brackets are special to Pipelines as Code
    # They will automatically be expanded with the events from Github.
    # https://pipelinesascode.com/docs/guide/authoringprs/#default-parameters
    - name: url
      value: "{{ repo_url }}"
    - name: revision
      value: "{{ revision }}"
    - name: repo_name
      value: "{{ repo_name }}"
    - name: branch_name
      value: "{{ source_branch }}"
    - name: pull_request_number
      value: "{{ pull_request_number }}"
    - name: pull_request_base_ref
      value: "{{ body.pull_request.base.ref }}"
    - name: pull_request_url
      value: "{{ body.pull_request.html_url }}"
    - name: assets_env
      value: ""
    - name: clone_url
      value: "{{ repo_url }}"
    - name: pipeline_label_selector
      value: "pipeline-run-name=my-pr,pipelinesascode.tekton.dev/state=started,pipelinesascode.tekton.dev/state!=completed,pipelinesascode.tekton.dev/state!=failed,pipelinesascode.tekton.dev/state!=cancelled,pipelinesascode.tekton.dev/sha!={{ revision }},pipelinesascode.tekton.dev/pull-request=={{ pull_request_number }}"
  # This works.. and only affects build-images
  taskRunTemplate:
    serviceAccountName: cicd
    podTemplate:
      nodeSelector:
        nodes.io/node-role: iops
      tolerations:
        - effect: NoExecute
          key: role
          operator: Equal
          value: iops
      securityContext:
        fsGroup: 65532
        fsGroupChangePolicy: OnRootMismatch
  taskRunSpecs:
    - pipelineTaskName: fetch-repository
      computeResources:
        requests:
          cpu: 2
          memory: 256Mi
    - pipelineTaskName: build-images
      metadata:
        annotations:
          karpenter.sh/do-not-disrupt: "true"
      stepSpecs:
        - name: get-skaffold-output-images
          computeResources:
            requests:
              cpu: 1
              memory: 256Mi
        - name: build-image
          computeResources:
            requests:
              cpu: 2
              memory: 13Gi
        - name: write-digest
          computeResources:
            requests:
              cpu: 1
              memory: 256Mi
        - name: digest-to-results
          computeResources:
            requests:
              cpu: 1
              memory: 256Mi
    - pipelineTaskName: rubocop
      metadata:
        annotations:
          karpenter.sh/do-not-disrupt: "true"
      computeResources:
        requests:
          cpu: 2
          memory: 2Gi
    - pipelineTaskName: unit-test
      metadata:
        annotations:
          karpenter.sh/do-not-disrupt: "true"
      stepSpecs:
        - name: run-unit-test
          computeResources:
            requests:
              cpu: 2
              memory: 2Gi
        - name: check-test-results
          computeResources:
            requests:
              cpu: 1
              memory: 256Mi
    - pipelineTaskName: sonarqube-scanner
      metadata:
        annotations:
          karpenter.sh/do-not-disrupt: "true"
      computeResources:
        requests:
          cpu: 2
          memory: 2Gi
  pipelineRef:
    name: ci
  timeouts:
    pipeline: "1h30m00s"
  workspaces:
    - name: sonar_cache
      persistentVolumeClaim:
        claimName: efs-tekton-direct-cicd
      subPath: "CI/my/sonar_cache"
    - name: source
      volumeClaimTemplate:
        spec:
          storageClassName: efs-tekton-sc-dynamic
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 2Gi
    - name: dockerconfig
      secret:
        secretName: docker-credentials
    - name: scratch
      volumeClaimTemplate:
        spec:
          storageClassName: efs-tekton-sc-dynamic
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 2Gi
    # This workspace will inject secret to help the git-clone task to be able to
    # checkout the private repositories
    - name: basic-auth
      secret:
        secretName: "{{ git_auth_secret }}"
At this point I'm not sure what exactly is being modified, but it only happens with this PipelineRun, and only through pipelinesascode.
@vdemeester any idea?
🤔 we should try to reproduce this 🤔
@chmouel given that there is no spec.status in the above PipelineRun definition, I guess PAC is setting the status to PipelineRunPending, right?
I wonder if there could be a race (or at least a bug) where the webhook (or the controller) thinks the PipelineRun has started when it hasn't…
Is concurrency used?
@jwitrick can you share the full YAML of the object? I am interested in seeing the status, because if it's a race, it might be "seen" as started from the Pipeline controller's perspective even though it has the pending status.
Also, which pipelines-as-code version are you running?
(I tried to replicate simply by setting PipelineRunPending when creating the object, and it just works. I guess that if it's using PAC concurrency, and it's the watcher that sets PipelineRunPending, something might go wrong.)
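For anyone who wants to repeat the manual check described above, a minimal sketch of a PipelineRun created directly in the pending state might look like the following. The name prefix, the inline task, and the busybox image are placeholders invented for illustration and are not taken from the reporter's setup; only `spec.status: "PipelineRunPending"` corresponds to what is discussed in this thread.

```yaml
# Hypothetical minimal repro: a PipelineRun created in the pending state.
# It should stay pending until spec.status is cleared or changed.
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: pending-repro-   # placeholder name prefix
spec:
  status: "PipelineRunPending"   # the only part taken from the discussion
  pipelineSpec:
    tasks:
      - name: hello              # placeholder inline task
        taskSpec:
          steps:
            - name: echo
              image: busybox     # placeholder image
              script: |
                echo "hello"
```

Because it uses `generateName`, this would need to be created with `kubectl create -f` rather than `kubectl apply -f`. It does not go through PAC, so it would not exercise the concurrency path suspected above; to my understanding that path is enabled by a `concurrency_limit` on the PAC Repository resource, which is worth checking in the affected namespace.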