pipeline icon indicating copy to clipboard operation
pipeline copied to clipboard

[TEP-0137] Rework the events controller cache

Open afrittoli opened this issue 2 years ago • 9 comments

Dependencies:

  • [ ] #6883
  • [ ] #6884
  • [ ] #6889

Once dependencies are merged, the PR will contain 1 commit only

Changes

Rework the events controller cache

Switch the cache implementation from a simple LRU to the bigcache library which provides better performance and reduces the need for garbage collection.

Further improve the cache by reducing the size of keys, by adopting hashing of keys and thus storing smaller data in the cache. Hashing also allows for detecting changes in a larger surface of the data. While todays keys are based on the event type and resource metadata, we can now include conditions in the equation as well, to provide and enhanced logic for event sending.

Storing a resource metadata and its condition in the hashed keys means we can detect immediately if no relevant change was made in a resource, and quickly finish the reconcile cycle. Additionally storing the event type along with the resource metadata, means that we can detect if a specific message was already sent for a specific resource.

To be able to rely on the cache to decide which event to send, we need to ensure that reconcile cycles for the same resource are never executed concurrently. To achieve this, the thread number (or concurrency) is set to 1 in the controller.

Scalability needs can be addressed by scaling the events controller horizontally. The sharding of resources across instances of the reconciler is based on the resource key, so the same resource will always be directed to the same controller thus avoiding any cache miss.

Signed-off-by: Andrea Frittoli [email protected]

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • [ ] Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • [x] Has Tests included if any functionality added or changed
  • [x] Follows the commit message standard
  • [x] Meets the Tekton contributor standards (including functionality, content, code)
  • [x] Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • [x] Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • [ ] Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

NONE

/kind misc

afrittoli avatar Jul 10 '23 12:07 afrittoli

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: To complete the pull request process, please ask for approval from afrittoli after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

tekton-robot avatar Jul 10 '23 12:07 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1/pipelinerun_types.go 89.5% 88.7% -0.8
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 89.7% 88.9% -0.8
pkg/reconciler/events/cache/cache.go 83.3% 72.9% -10.4
pkg/reconciler/events/cache/cacheclient.go 7.1% 6.2% -0.9
pkg/reconciler/events/cloudevent/cloud_event_controller.go 84.6% 83.6% -1.0
pkg/reconciler/notifications/controller.go Do not exist 25.0%
pkg/reconciler/notifications/customrun/reconciler.go Do not exist 87.5%
pkg/reconciler/notifications/runtimeobject.go Do not exist 100.0%

tekton-robot avatar Jul 10 '23 12:07 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1/pipelinerun_types.go 89.5% 88.7% -0.8
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 89.7% 88.9% -0.8
pkg/reconciler/events/cache/cache.go 83.3% 72.9% -10.4
pkg/reconciler/events/cache/cacheclient.go 7.1% 6.2% -0.9
pkg/reconciler/events/cloudevent/cloud_event_controller.go 84.6% 83.6% -1.0
pkg/reconciler/notifications/controller.go Do not exist 25.0%
pkg/reconciler/notifications/customrun/reconciler.go Do not exist 87.5%
pkg/reconciler/notifications/runtimeobject.go Do not exist 100.0%

tekton-robot avatar Jul 10 '23 12:07 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1/pipelinerun_types.go 89.5% 88.7% -0.8
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 89.7% 88.9% -0.8
pkg/reconciler/events/cache/cache.go 83.3% 72.9% -10.4
pkg/reconciler/events/cache/cacheclient.go 7.1% 6.7% -0.5
pkg/reconciler/events/cloudevent/cloud_event_controller.go 84.6% 83.6% -1.0
pkg/reconciler/notifications/controller.go Do not exist 25.0%
pkg/reconciler/notifications/customrun/reconciler.go Do not exist 87.5%
pkg/reconciler/notifications/runtimeobject.go Do not exist 100.0%

tekton-robot avatar Jul 11 '23 13:07 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1/pipelinerun_types.go 89.5% 88.7% -0.8
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 89.7% 88.9% -0.8
pkg/reconciler/events/cache/cache.go 83.3% 72.9% -10.4
pkg/reconciler/events/cache/cacheclient.go 7.1% 6.7% -0.5
pkg/reconciler/events/cloudevent/cloud_event_controller.go 84.6% 83.6% -1.0
pkg/reconciler/notifications/controller.go Do not exist 25.0%
pkg/reconciler/notifications/customrun/reconciler.go Do not exist 87.5%
pkg/reconciler/notifications/runtimeobject.go Do not exist 100.0%

tekton-robot avatar Jul 11 '23 13:07 tekton-robot

@afrittoli: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-tekton-pipeline-unit-tests 06cfd976646a03229a5c7628f48c8441a54f9771 link true /test tekton-pipeline-unit-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

tekton-robot avatar Jul 11 '23 13:07 tekton-robot

@afrittoli: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Aug 08 '23 20:08 tekton-robot

@afrittoli do you think we can get this in for this milestone? Or should I move it over to 0.57?

chitrangpatel avatar Jan 10 '24 15:01 chitrangpatel

This PR has not made any progress since Pipelines v0.51 when it was initiated introduced. Clearing it from the milestone. Please add back as appropriate. Thanks!

pritidesai avatar Mar 20 '24 16:03 pritidesai