Minimizing CPU/memory usage during times of high resource churn

Open TNonet opened this issue 9 months ago • 1 comments

What would you like to be added?

Once in a while, our K8s cluster goes through periods of high resource churn. These push a large number of events to the cluster's clustersynchroManager and cause large CPU spikes. What is the recommended way to ensure that this leader-elected pod can handle the load?

  • Should I increase the worker count? What is too high / too low?
  • Can we shard clustersynchroManager by resource UID so that multiple pods can handle the work?
    • Is that what this PR offers? https://github.com/clusterpedia-io/clusterpedia/pull/609 Could there be more docs / examples on how it works?
  • Can we prune how/which updates the clustersynchroManager watches / listens for? I see there is a feature to prune fields, but can it be expanded / made more dynamic to suit specific needs? https://clusterpedia.io/docs/features/prune-fields/
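
To make the sharding question in the second bullet concrete, here is a minimal sketch of what "shard by resource UID" could mean. It is not taken from Clusterpedia or from the linked PR. The names `shardFor` and `ownsResource`, and the idea of giving each pod a fixed shard index, are hypothetical and only illustrate the request.

```go
package main

import "hash/fnv"

// shardFor maps a resource UID to a shard in [0, shardCount) using FNV-1a.
// Hypothetical helper for illustration; not part of Clusterpedia.
func shardFor(uid string, shardCount uint32) uint32 {
	if shardCount == 0 {
		return 0 // avoid division by zero; treat as a single shard
	}
	h := fnv.New32a()
	h.Write([]byte(uid)) // Write on a hash.Hash never returns an error
	return h.Sum32() % shardCount
}

// ownsResource reports whether the pod with index shardIndex should
// process events for the resource with the given UID.
// Hypothetical helper for illustration; not part of Clusterpedia.
func ownsResource(uid string, shardIndex, shardCount uint32) bool {
	return shardFor(uid, shardCount) == shardIndex
}
```

With something like this, each pod would drop events for which `ownsResource` returns false, so every UID is handled by exactly one pod. A plain modulo reassigns most UIDs whenever the shard count changes, so a real design would probably need consistent hashing or a resync step.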

Why is this needed?

Given the single-pod nature of clustersynchroManager, it is important to make sure the pod can handle the rate of updates needed to keep the cluster in sync.

TNonet avatar Feb 21 '25 18:02 TNonet

Hi @TNonet, Thanks for opening an issue! We will look into it as soon as possible.

Instructions for interacting with me using comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the gh-ci-bot repository.

clusterpedia-bot avatar Feb 21 '25 18:02 clusterpedia-bot