Termination message could not be parsed as JSON: parsing message json

Open pawelkopka opened this issue 2 years ago • 3 comments

Expected Behavior

Pipelines pass

Actual Behavior

When we run multiple pipelines at once (several PipelineRuns of the same pipeline), some of them hang on random tasks and emit the events shown below. In a pipeline with 18 tasks, this affects 1-2 tasks, and the pipeline then keeps running until it hits its timeout.

Warning  InternalError          3m (x19 over 22m)  TaskRun  2 errors occurred:
           * parsing message json: invalid character 'i' in literal false (expecting 'l')
           * parsing message json: invalid character 'i' in literal false (expecting 'l')
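
For context, the inner part of this error is the text Go's `encoding/json` produces when the input begins with `fa` followed by `i` instead of `l`, that is, when plain text such as a word starting with "fai" is decoded where JSON was expected. The sketch below reproduces the message with the standard library only. The sample strings are hypothetical; the actual termination message contents are not shown in this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseErr returns the error text from decoding msg as JSON,
// or an empty string if msg is valid JSON.
func parseErr(msg string) string {
	var v interface{}
	if err := json.Unmarshal([]byte(msg), &v); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// Hypothetical message contents, for illustration only.
	samples := []string{
		`[{"key":"digest","value":"abc"}]`, // valid JSON array
		`failed to write results`,          // plain text starting with "fai"
	}
	for _, msg := range samples {
		fmt.Printf("%q -> %q\n", msg, parseErr(msg))
	}
}
```

If this reading is right, something other than JSON ended up in the step's termination message, which would be consistent with the question about who wrote the message later in this thread.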

Some pipelines hang.

Steps to Reproduce the Problem

  1. Run multiple pipelines at the same time (at least 4 pipelines, each with 18 parallel tasks)

Additional Info

Logs from controller:

logger: "tekton-pipelines-controller"

message: "termination message could not be parsed as JSON: parsing message json: invalid character 'i' in literal false (expecting 'l')"

stacktrace: "github.com/tektoncd/pipeline/pkg/pod.setTaskRunStatusBasedOnStepStatus
	github.com/tektoncd/pipeline/pkg/pod/status.go:156
github.com/tektoncd/pipeline/pkg/pod.MakeTaskRunStatus
	github.com/tektoncd/pipeline/pkg/pod/status.go:135
github.com/tektoncd/pipeline/pkg/reconciler/taskrun.(*Reconciler).reconcile
	github.com/tektoncd/pipeline/pkg/reconciler/taskrun/taskrun.go:537
github.com/tektoncd/pipeline/pkg/reconciler/taskrun.(*Reconciler).ReconcileKind
	github.com/tektoncd/pipeline/pkg/reconciler/taskrun/taskrun.go:184
github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/taskrun.(*reconcilerImpl).Reconcile
	github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/taskrun/reconciler.go:235
knative.dev/pkg/controller.(*Impl).processNextWorkItem
	knative.dev/[email protected]/controller/controller.go:542
knative.dev/pkg/controller.(*Impl).RunContext.func3
	knative.dev/[email protected]/controller/controller.go:491"
  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12-gke.2300", GitCommit:"e55564cf3a1384026a54920174977659c8c56a50", GitTreeState:"clean", BuildDate:"2022-08-16T09:24:51Z", GoVersion:"go1.16.15b7", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.22) exceeds the supported minor version skew of +/-1
  • Tekton Pipeline version:
v0.40.2

When we run a single pipeline, it works correctly; the issue occurs when we run multiple pipelines at the same time.

pawelkopka avatar Oct 11 '22 09:10 pawelkopka

Could you please get the pod information with the command kubectl get pod ${podname} -o yaml?

Maybe the error occurs because the termination messages are too large, or because the termination messages were not written by the entrypoint.

chengjoey avatar Oct 11 '22 16:10 chengjoey

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot avatar Jan 09 '23 17:01 tekton-robot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten with a justification. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot avatar Feb 08 '23 17:02 tekton-robot

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen with a justification. Mark the issue as fresh with /remove-lifecycle rotten with a justification. If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot avatar Mar 10 '23 17:03 tekton-robot

@tekton-robot: Closing this issue.

tekton-robot avatar Mar 10 '23 17:03 tekton-robot