OCPBUGS-42805: Add node caching with Kubernetes watch API to reduce API load
Replace the frequent node LIST calls in the monitor loops with a watch-based cache. Implement a NodeWatcher modeled on the existing loggingconfig watcher pattern.
- Add pkg/nodeconfig with NodeWatcher and NodeCacheGetter interface
- Refactor node retrieval functions to use cache when available
- Update monitors (dynkeepalived, coredns) to use NodeWatcher
- Reduce API calls from hundreds per minute to a single watch connection
- Maintain backward compatibility with nil cache fallback
Fixes: OCPBUGS-42805
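The approach described above can be illustrated with a minimal, standard-library-only sketch. It is not the code in this PR: `Node`, `Event`, `Apply`, `Run`, `GetNodes` and `getNodes` are hypothetical names, and the method set of `NodeCacheGetter` is an assumption. A real implementation would presumably feed the cache from a client-go watch or informer rather than from a plain channel.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Node is a stand-in for corev1.Node (assumption: only name and IP matter here).
type Node struct {
	Name string
	IP   string
}

// EventType mirrors the ADDED/MODIFIED/DELETED events a Kubernetes watch delivers.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// Event is a simplified watch event (hypothetical type).
type Event struct {
	Type EventType
	Node Node
}

// NodeCacheGetter is the read side of the cache; the method set is assumed.
type NodeCacheGetter interface {
	GetNodes() []Node
}

// NodeWatcher keeps an in-memory copy of the node set, updated from watch events.
type NodeWatcher struct {
	mu    sync.RWMutex
	nodes map[string]Node
}

func NewNodeWatcher() *NodeWatcher {
	return &NodeWatcher{nodes: make(map[string]Node)}
}

// Apply folds a single watch event into the cache.
func (w *NodeWatcher) Apply(ev Event) {
	w.mu.Lock()
	defer w.mu.Unlock()
	switch ev.Type {
	case Added, Modified:
		w.nodes[ev.Node.Name] = ev.Node
	case Deleted:
		delete(w.nodes, ev.Node.Name)
	}
}

// Run consumes events until the channel is closed.
func (w *NodeWatcher) Run(events <-chan Event) {
	for ev := range events {
		w.Apply(ev)
	}
}

// GetNodes returns a snapshot of the cached nodes, sorted by name.
func (w *NodeWatcher) GetNodes() []Node {
	w.mu.RLock()
	defer w.mu.RUnlock()
	out := make([]Node, 0, len(w.nodes))
	for _, n := range w.nodes {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// getNodes reads from the cache when one is supplied and falls back to a
// LIST call when cache is nil, which keeps existing callers working.
func getNodes(cache NodeCacheGetter, list func() ([]Node, error)) ([]Node, error) {
	if cache != nil {
		return cache.GetNodes(), nil
	}
	return list()
}

func main() {
	listCalls := 0
	list := func() ([]Node, error) {
		listCalls++
		return []Node{{Name: "master-0", IP: "192.0.2.10"}}, nil
	}

	events := make(chan Event, 3)
	events <- Event{Type: Added, Node: Node{Name: "master-0", IP: "192.0.2.10"}}
	events <- Event{Type: Added, Node: Node{Name: "worker-0", IP: "192.0.2.20"}}
	events <- Event{Type: Deleted, Node: Node{Name: "worker-0"}}
	close(events)

	w := NewNodeWatcher()
	w.Run(events)

	// Three monitor iterations served from the cache, then one nil-cache fallback.
	for i := 0; i < 3; i++ {
		nodes, _ := getNodes(w, list)
		fmt.Println("cached nodes:", len(nodes), "LIST calls so far:", listCalls)
	}
	nodes, _ := getNodes(nil, list)
	fmt.Println("fallback nodes:", len(nodes), "LIST calls so far:", listCalls)
}
```

In this sketch the monitor loop never triggers a LIST while a cache is supplied; only the nil-cache path calls `list`. Note that the nil check only works when callers pass a literal `nil` interface, not a typed nil pointer such as `(*NodeWatcher)(nil)`.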
@mkowalski: This pull request references Jira Issue OCPBUGS-42805, which is invalid:
- expected the bug to target the "4.21.0" version, but no target version was set
Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
The bug has been updated to refer to the pull request using the external bug tracker.
/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv4
/hold
I don't want this in 4.21. It will stay on hold until branch cut and should land only in 4.22.
/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv4
Conformance failures
{ fail [github.com/openshift/origin/test/extended/apiserver/tls.go:151]: Expected success true, got false with TLS version VersionTLS12 dialing master}
@mkowalski: all tests passed!
/lgtm
Looks like a good way to address a longstanding issue!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cybertron, mkowalski
- ~~OWNERS~~ [cybertron,mkowalski]