feat(metrics): Migrate from OpenCensus to OpenTelemetry

Open khrm opened this issue 3 months ago • 8 comments

  • Updated pipelinerunmetrics and taskrunmetrics to use OpenTelemetry instruments (histograms, counters, gauges) for creating and recording metrics.
  • Introduced new OpenTelemetry configurations in config/config-observability.yaml for exporters and protocols.
  • Rewrote the test suites for pipelinerunmetrics and taskrunmetrics to be compatible with the new OpenTelemetry-based implementation.
  • Updated knative to 1.19

Changes

fixes https://github.com/tektoncd/pipeline/issues/8969

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • [ ] Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • [ ] Has Tests included if any functionality added or changed
  • [ ] pre-commit Passed
  • [ ] Follows the commit message standard
  • [ ] Meets the Tekton contributor standards (including functionality, content, code)
  • [ ] Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • [ ] Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • [ ] Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

NONE

/kind feature

khrm avatar Sep 29 '25 10:09 khrm

The following is the coverage report on the affected files. Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/pipelinerunmetrics/metrics.go | 86.7% | 68.8% | -17.9 |
| pkg/reconciler/pipelinerun/pipelinerun.go | 91.6% | 91.6% | -0.0 |
| pkg/taskrunmetrics/metrics.go | 87.3% | 72.8% | -14.5 |

tekton-robot avatar Sep 29 '25 10:09 tekton-robot

/retest

waveywaves avatar Sep 29 '25 12:09 waveywaves

/retest

waveywaves avatar Sep 29 '25 19:09 waveywaves

Re the knative bump: we have discussed multiple times that we want https://github.com/knative/pkg/commit/04fdd0bbdf0d1be771598dcac914fd906d8f7622 included in the next bump, even though it only exists in main for now (not even in 1.20). It will unlock better finalizer management, the lack of which has proved to be an issue for Tekton deployments with multiple controllers managing the same PR/TR resources. So do we want to bump knative/pkg even higher?

enarha avatar Dec 15 '25 12:12 enarha

/assign @vdemeester @waveywaves

khrm avatar Dec 16 '25 02:12 khrm

@enarha I have updated to the latest knative/pkg for now.

khrm avatar Dec 16 '25 02:12 khrm

Why are the CI tests being skipped?

khrm avatar Dec 16 '25 02:12 khrm

/kind feature

khrm avatar Dec 16 '25 03:12 khrm

/retest

khrm avatar Dec 16 '25 03:12 khrm

Lint is failing due to older changes.

   failed to fetch pull request patch: RequestError [HttpError]: Sorry, the diff exceeded the maximum number of files (300). Consider using 'List pull requests files' API or locally cloning the repository instead.: {"resource":"PullRequest","field":"diff","code":"too_large"} - https://docs.github.com/rest/pulls/pulls#list-pull-requests-files
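
The error message points at the 'List pull requests files' API as the alternative to fetching the whole diff. As a rough sketch of what a paginated request to that endpoint could look like, the helper below only builds the page URLs. The function name `pullFilesURL` and the pull request number 123 are placeholders for illustration, not taken from this PR or from the failing workflow.

```go
package main

import (
	"fmt"
	"net/url"
)

// maxPerPage is the largest page size the "List pull requests files"
// endpoint accepts.
const maxPerPage = 100

// pullFilesURL builds the URL for one page of the "List pull requests files"
// endpoint. The name and signature are hypothetical; owner, repo, and number
// are supplied by the caller.
func pullFilesURL(owner, repo string, number, page int) string {
	q := url.Values{}
	q.Set("per_page", fmt.Sprint(maxPerPage))
	q.Set("page", fmt.Sprint(page))
	return fmt.Sprintf("https://api.github.com/repos/%s/%s/pulls/%d/files?%s",
		url.PathEscape(owner), url.PathEscape(repo), number, q.Encode())
}

func main() {
	// 123 is a placeholder pull request number, not the number of this PR.
	for page := 1; page <= 3; page++ {
		fmt.Println(pullFilesURL("tektoncd", "pipeline", 123, page))
	}
}
```

A workflow step would fetch these pages in order until one returns fewer than `maxPerPage` entries, instead of requesting the full diff that hits the 300-file limit.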

khrm avatar Dec 16 '25 03:12 khrm

Grep isn't working correctly. Maybe I will raise a separate PR for the workflow fix, or it can be part of this one.

khrm avatar Dec 16 '25 03:12 khrm

ko apply is failing with: The CustomResourceDefinition "tasks.tekton.dev" is invalid: metadata.annotations: Too long: may not be more than 262144 bytes

For now, only ko create works. I will fix that as well.

khrm avatar Dec 16 '25 04:12 khrm

@vdemeester @waveywaves Let's review and merge this.

khrm avatar Dec 17 '25 18:12 khrm

/assign @twoGiants

khrm avatar Dec 17 '25 18:12 khrm

/hold

I found an issue while testing. It seems to be a mistake made while rebasing.

khrm avatar Dec 18 '25 11:12 khrm

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

To complete the pull request process, please ask for approval from twogiants after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

tekton-robot avatar Dec 18 '25 14:12 tekton-robot

/hold cancel

khrm avatar Dec 18 '25 14:12 khrm

I tested this. It's working fine. @waveywaves @vdemeester

khrm avatar Dec 18 '25 14:12 khrm

@tektoncd/chains-maintainers @tektoncd/triggers-maintainers @tektoncd/results-maintainers You can use the commits in this PR as steps to update knative/pkg and migrate to otel.

khrm avatar Dec 18 '25 14:12 khrm

> @tektoncd/chains-maintainers @tektoncd/triggers-maintainers @tektoncd/results-maintainers You can use the commits in this PR as steps to update knative/pkg and migrate to otel.

@waveywaves @vdemeester @twoGiants Let's review this so that other components are unblocked.

khrm avatar Dec 19 '25 12:12 khrm

cc: @divyansh42 (results) @infernus01 (triggers) @anithapriyanatarajan (chains)

anithapriyanatarajan avatar Dec 19 '25 13:12 anithapriyanatarajan

@khrm: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Dec 22 '25 17:12 tekton-robot

> Another question: knative updates often bring in a new min k8s version as well. Is that not the case for this PR?

Yes. The minimum K8s version will increase.

khrm avatar Dec 23 '25 09:12 khrm