cluster-api-provider-openstack: Document guidelines around CAPI + CAPO cluster scalability

/kind documentation

This is not specifically a code issue, but we were wondering whether any data or guidelines exist around CAPO cluster scalability. More specifically, we are thinking about:

- Number of workload clusters
- Number of active nodes (would this be provider specific?)
- Number of healthchecks
- Other important params?

If ballpark numbers don't exist today, do we know which parameters are important to track for CAPI cluster scalability, and what behaviours we might expect to see as we start to stretch its capabilities?

And finally, are there any guidelines for scaling a CAPI cluster in terms of which resources can be scaled up/out?

Note that this issue is directly related to https://github.com/kubernetes-sigs/cluster-api/issues/7308, raised in the CAPI project; however, for our needs we are specifically interested in CAPI + CAPO.

Sep 29 '22 11:09 cunningr

This seems to be more of a user/operator question than a developer one. From a dev perspective it is tough to answer, since we have no operations experience and no hardware resources. @seanschneeweiss, not sure whether you have any insight?

Sep 30 '22 01:09 jichenjc

I remember watching this: https://www.youtube.com/watch?v=KzYV-fJ_wH0. Very good share, @seanschneeweiss :)

Sep 30 '22 06:09 jichenjc

We reached > 350 clusters with > 1990 machines in one of our OpenStack regions. At this scale we have to start analyzing, because our provisioning takes longer than usual, with unexplained waiting times between two controllers. I'll provide more information as soon as we have narrowed down the problem we are currently facing. However, CAPO doesn't seem to be the bottleneck; I think the OpenStack API might become a problem soon, but not the controller itself. Currently we are using the following concurrency values:

 - --openstackcluster-concurrency=10
 - --openstackmachine-concurrency=20

Memory limit/request is at 500Mi, but actual usage is at 140Mi; it can spike on pod restart. CPU request is at 500m, but actual usage is at 160m. Of course, during updates (new machines) this can increase.
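
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the concurrency flags bound how many objects are reconciled in parallel, and it uses a made-up per-reconcile latency of 2 seconds; that latency, the helper names, and the idealized batching model are assumptions for illustration and were not measured or reported in this thread. The cluster and machine counts are the lower bounds quoted above.

```python
import math

# Figures quoted in the comment above (both are lower bounds: "> 350", "> 1990").
CLUSTERS = 350
MACHINES = 1990
CLUSTER_CONCURRENCY = 10   # --openstackcluster-concurrency
MACHINE_CONCURRENCY = 20   # --openstackmachine-concurrency

# HYPOTHETICAL value, not from the thread: average seconds per reconcile.
ASSUMED_SECONDS_PER_RECONCILE = 2.0


def full_pass_seconds(objects: int, concurrency: int, seconds_per_reconcile: float) -> float:
    """Idealized time to reconcile every object once.

    Assumes every reconcile takes the same time and workers are never idle,
    so real-world numbers will be higher.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    batches = math.ceil(objects / concurrency)
    return batches * seconds_per_reconcile


def utilization(actual: float, requested: float) -> float:
    """Fraction of a resource request that is actually in use."""
    return actual / requested


if __name__ == "__main__":
    print("machines per cluster:", round(MACHINES / CLUSTERS, 1))
    print("machine pass (s):", full_pass_seconds(
        MACHINES, MACHINE_CONCURRENCY, ASSUMED_SECONDS_PER_RECONCILE))
    print("cluster pass (s):", full_pass_seconds(
        CLUSTERS, CLUSTER_CONCURRENCY, ASSUMED_SECONDS_PER_RECONCILE))
    # Memory: 140Mi used out of a 500Mi request.
    print("memory utilization:", utilization(140, 500))
```

Under this simplified model, raising a concurrency value shortens a full pass only as long as the OpenStack API can absorb the extra parallel requests, which is the concern raised above.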

Sean Schneeweiss, Mercedes-Benz Tech Innovation GmbH, Provider Information

Nov 12 '22 14:11 seanschneeweiss

Those are some really cool numbers. I would totally love to chat about how OpenStack is slowing down and about profiling that part!

Nov 14 '22 22:11 mnaser

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
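
The three rules above can be read as a simple date calculation. The following Python sketch computes the earliest date on which each step could apply, given the date of the last activity; the function name and return shape are hypothetical, and the bot may act later than these dates depending on when it runs.

```python
from datetime import date, timedelta

# Thresholds taken from the rules listed above.
STALE_AFTER = timedelta(days=90)    # inactivity before lifecycle/stale
ROTTEN_AFTER = timedelta(days=30)   # further inactivity before lifecycle/rotten
CLOSE_AFTER = timedelta(days=30)    # further inactivity before the issue is closed


def earliest_transitions(last_activity: date) -> dict:
    """Return the earliest possible date for each lifecycle step.

    Assumes no activity at all after `last_activity`; any new comment
    or /remove-lifecycle command would reset the clock.
    """
    stale = last_activity + STALE_AFTER
    rotten = stale + ROTTEN_AFTER
    closed = rotten + CLOSE_AFTER
    return {"stale": stale, "rotten": rotten, "closed": closed}


if __name__ == "__main__":
    # The last human comment before this bot message is dated Nov 14 '22.
    for step, when in earliest_transitions(date(2022, 11, 14)).items():
        print(step, when.isoformat())
```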

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Feb 23 '23 08:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Mar 25 '23 09:03 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Apr 24 '23 09:04 k8s-triage-robot

@k8s-triage-robot: You can't close an active issue/PR unless you authored it or you are a collaborator.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Apr 24 '23 09:04 k8s-ci-robot

@mnaser

It is not OpenStack that seems to be slowing down. It is probably related to slowness of the operators - not sure yet.

Apr 24 '23 10:04 seanschneeweiss

/remove-lifecycle stale

I don't know what is going to be done in this issue. Let's keep it open, and if there is no activity later on, let's close it.

Apr 25 '23 00:04 jichenjc

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

May 25 '23 01:05 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

May 25 '23 01:05 k8s-ci-robot
