Optimize binding controllers with fine-grained event filtering to reduce redundant reconciliations
Optimize binding controllers with custom event predicate
Summary
Introduces a custom predicate for binding controllers to filter events and reduce unnecessary reconciliations:
- Ignores delete events (cleanup handled by finalizers)
- Always processes create events
- Processes updates only for meaningful changes (generation, DeletionTimestamp, key
.specfields)
Result: Better performance with no behavior changes.
Motivation
Current controllers watch resources with broad predicates, causing redundant reconciliations. Filtering irrelevant events reduces CPU/memory usage, API calls, and cache churn while preserving finalizer cleanup.
Changes
- Added:
ResourceBindingPredicate - Updated: Binding controllers to use the new predicate
- Added: Unit tests
Predicate Logic
- ✅ Create: Always processed
- ❌ Delete: Ignored
- ⚠️ Update: Only if generation changes, DeletionTimestamp is set, or critical
.specfields change (requiredBy,clusters,resource,gracefulEvictionTasks,suspension,preserveResourcesOnDeletion,conflictResolution)
Event Behavior
| Event Type | Before | After | Reason |
|---|---|---|---|
| Create | ✅ Process | ✅ Process | Always needed |
| Delete | ✅ Process | ❌ Ignore | Finalizers handle cleanup |
| Update (generation) | ✅ Process | ✅ Process | Meaningful change |
| Update (status only) | ✅ Process | ❌ Ignore | No reconciliation needed |
Files Changed
Added:
pkg/controllers/binding/binding_predicate.gopkg/controllers/binding/binding_predicate_test.go
Modified:
pkg/controllers/binding/binding_controller.gopkg/controllers/binding/cluster_resource_binding_controller.go
Testing
- Unit tests cover all event types
- Delete events ignored as expected
- Critical
.specchanges trigger reconciliation - Finalizer cleanup verified via DeletionTimestamp
Backward Compatibility
- No API changes
- Finalizer cleanup preserved
- Override policy watch behavior unchanged
Performance Impact
- Fewer reconciliations
- Reduced API calls and cache churn
- Lower CPU/memory usage
Linked Issue
Part of: #6780
Checklist
- [x] Unit tests added/passing
- [x] No breaking changes
- [x] Predicate applied to both RB and CRB controllers
- [x] Finalizer handling verified via DeletionTimestamp path
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign zach593 for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Summary of Changes
Hello @rohan-019, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the performance of ResourceBinding and ClusterResourceBinding controllers by implementing a custom event predicate. This predicate selectively processes events, preventing redundant reconciliation loops for changes that do not impact the core logic, thereby improving resource efficiency and responsiveness across the system.
Highlights
- Custom Event Predicate: Introduces a new
ResourceBindingPredicateto intelligently filter events forResourceBindingandClusterResourceBindingcontrollers. - Optimized Event Handling: Delete events are now ignored (cleanup handled by finalizers), create events are always processed, and update events are processed only for meaningful changes (generation, DeletionTimestamp, or critical
.specfields). - Performance Improvement: The fine-grained event filtering significantly reduces unnecessary reconciliations, leading to lower CPU/memory usage, fewer API calls, and reduced cache churn.
- No Behavior Change: The optimization is achieved without altering the existing functional behavior or backward compatibility of the controllers.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
:x: Patch coverage is 83.33333% with 8 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 45.87%. Comparing base (70411ab) to head (62be891).
:warning: Report is 19 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/controllers/binding/binding_predicate.go | 86.95% | 6 Missing :warning: |
| pkg/controllers/binding/binding_controller.go | 0.00% | 1 Missing :warning: |
| ...ers/binding/cluster_resource_binding_controller.go | 0.00% | 1 Missing :warning: |
| :exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality. |
Additional details and impacted files
@@ Coverage Diff @@
## master #6784 +/- ##
==========================================
+ Coverage 45.74% 45.87% +0.13%
==========================================
Files 689 691 +2
Lines 57161 57346 +185
==========================================
+ Hits 26149 26310 +161
- Misses 29383 29406 +23
- Partials 1629 1630 +1
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 45.87% <83.33%> (+0.13%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Hey, @XiShanYongYe-Chang Please take a look at this!
Thanks a lot @rohan-019
Thanks~ Can you help fix the lint error?
I tried several times to fix lint errors using golangci-lint run --fix pkg/controllers/binding/binding_predicate.go pkg/controllers/binding/cluster_resource_binding_controller.go and also attempted a manual fix, but ended up with lint errors. Could you help me figure out what’s going wrong?
Hi @rohan-019, since this PR is the first one in our series, I have quite a few review comments. ^-^
Hi @rohan-019, since this PR is the first one in our series, I have quite a few review comments. ^-^
Hi @XiShanYongYe-Chang , I really appreciate your feedback, especially since this is the first PR in the series. I’ll go through all your comments carefully and make the necessary updates.
Thanks again for taking the time to review.
Hi @rohan-019, the newly added ut has failed.
the newly added ut has failed.
Hey @XiShanYongYe-Chang , I initially removed the generation check based on your comment that the global event filter already handles it. However, after testing We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.
The reason is that once we override UpdateFunc with custom logic, the default GenerationChangedPredicate is no longer applied automatically. This means spec-only changes are not triggering reconciliation unless we explicitly handle the generation change inside our predicate.
To keep the behavior correct and align with the current tests, I’ve restored the != generation check in UpdateFunc.
We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.
This test case is problematic because it is unlikely that only the generation will change while other fields remain unchanged. Therefore, it needs to be redesigned or removed.
To keep the behavior correct and align with the current tests, I’ve restored the != generation check in UpdateFunc.
There's no need to do this.
We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.
This test case is problematic because it is unlikely that only the generation will change while other fields remain unchanged. Therefore, it needs to be redesigned or removed.
That makes sense, as the Generation changes are handled globally, so this predicate doesn't need to check for them. But instead of removing the generation check, I redesigned so that this predicate now focuses on catching edge cases where: -Generation hasn't changed BUT -Specific critical spec fields have changed (like RequiredBy, Clusters, etc.)
Hey @XiShanYongYe-Chang, Don't mind, I accidentally closed PR!!
Hi @XiShanYongYe-Chang
I tried simplifying the deletion check from
e.ObjectOld.GetDeletionTimestamp() == nil && e.ObjectNew.GetDeletionTimestamp() != nil
to
e.ObjectOld.GetDeletionTimestamp() != e.ObjectNew.GetDeletionTimestamp()
as suggested, but this change caused failures in the e2e tests (e.g., migration_and_rollback_test.go timing out).
The reason is that the simplified form triggers reconciliation not only when deletion starts (nil → non-nil), but also if the deletion timestamp is updated while the object is already marked for deletion. That extra trigger led to unexpected reconcile behavior and race conditions, which is why the e2e tests failed.
So it seems we need to keep the original, more restrictive check to ensure reconciliation only fires once at the beginning of deletion.
The reason is that the simplified form triggers reconciliation not only when deletion starts
(nil → non-nil), but also if the deletion timestamp is updated while the object is already marked for deletion. That extra trigger led to unexpected reconcile behavior and race conditions, which is why the e2e tests failed.
Once the timestamp is set, it will not be modified again.
The E2E error should be just an occasional issue and likely not introduced by the current changes. Let's ignore it for now.
Can you help fix the lint errors and compress the commits into one?
I guess this PR is part of #6780, so I updated the Fixes to Part of in the PR description.