
Minimizing CPU/memory usage during times of high resource churn

Open • TNonet opened this issue 8 months ago • 1 comment

What would you like to be added?

Once in a while, our K8s cluster goes through periods of high resource churn. These push lots of events to the cluster's clustersynchroManager and cause large CPU spikes. What is the recommended way to ensure that this leader-elected pod can handle the load?

  • Should I increase the worker count? What counts as too high / too low?
  • Can we shard clustersynchroManager by resource UID so that multiple pods can share the work?
    • Is that what this PR offers? https://github.com/clusterpedia-io/clusterpedia/pull/609 Could there be more docs / examples of how it works?
  • Can we prune which updates the clustersynchroManager watches / listens for? I see there is a feature to prune fields, but can it be expanded or made more dynamic to suit specific needs? https://clusterpedia.io/docs/features/prune-fields/
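
To make the sharding question concrete, the idea could be sketched roughly as below. This is a hypothetical illustration only: it is not Clusterpedia's implementation and not necessarily what the PR linked above does. The names `shardFor` and `ownedBy`, the shard count, and the example UIDs are all made up for the sketch. Each instance hashes a resource's UID and only processes the resources that land on its own shard index.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a resource UID to a shard index in [0, shards) using FNV-1a.
// Hypothetical helper: not part of Clusterpedia.
func shardFor(uid string, shards uint32) uint32 {
	if shards == 0 {
		return 0 // avoid division by zero; treat as a single shard
	}
	h := fnv.New32a()
	h.Write([]byte(uid)) // Write on a hash.Hash never returns an error
	return h.Sum32() % shards
}

// ownedBy reports whether the instance with index self should process uid.
// Hypothetical helper: not part of Clusterpedia.
func ownedBy(uid string, self, shards uint32) bool {
	return shardFor(uid, shards) == self
}

func main() {
	const shards = 3 // assumed number of synchro instances

	// Example UIDs, invented for illustration.
	uids := []string{
		"0f6c1a52-3a77-4c1e-9a0b-2f5d8e6b7c01",
		"7d2e9b40-11aa-4f3c-8d55-6c0a1b2c3d4e",
		"c3b8a1f9-5e64-4d2a-b7c8-9e0f1a2b3c4d",
	}

	for _, uid := range uids {
		fmt.Printf("%s -> shard %d\n", uid, shardFor(uid, shards))
	}
}
```

With a filter like this applied after the watch, each instance would still receive every event and drop the ones it does not own, so it would spread processing work across pods but would not reduce watch traffic. A plain modulo also remaps most UIDs whenever the shard count changes.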

Why is this needed?

Given the single-pod nature of clustersynchroManager, it is important to make sure the pod can keep up with the rate of updates needed to sync the cluster.

TNonet, Feb 21 '25 18:02