
Integrate scale subresource by means of an HPA and the ScalingSchedule CRD

girishc13 opened this issue 1 year ago • 4 comments

One-line summary

Integrate the Horizontal Pod Autoscaler to allow scaling an EDS using the ScalingSchedule custom resource.

Description

This PR focuses on using the ScalingSchedule or ClusterScalingSchedule CRDs for automatically scaling up an EDS to the required number of replicas. The HPA does not directly control or edit the replicas of the EDS. Instead, the replica count reported by the HPA is stored in a separate property that is taken into account when calculating the next scaling operation. The HPA targets the EDS for registering the target scaling counts.

A new hpa_replicas property is introduced in the EDS spec. This property is controlled by the HPA via the scale subresource, and there is a corresponding status property. By default the HPA will not set the number of replicas to zero, so the default value of hpa_replicas is 1. The HPA scaling pattern respects the existing cool-down periods.
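For illustration, a minimal sketch of how the spec and status types could carry this property is shown below. The Go field names and JSON tags are assumptions; only the hpa_replicas property, its default of 1, and the scale subresource mapping come from the description above.

package v1

// Illustrative sketch only; the exact types in the PR may differ.
type ElasticsearchDataSetSpec struct {
    // Replicas is the node count the operator manages on the StatefulSet.
    Replicas *int32 `json:"replicas,omitempty"`

    // HpaReplicas is written by the HPA through the scale subresource
    // (.spec.hpa_replicas). It is only an input to the scaling
    // calculation and is never applied to the StatefulSet directly.
    // It defaults to 1 because the HPA does not scale to zero.
    HpaReplicas int32 `json:"hpa_replicas,omitempty"`
}

type ElasticsearchDataSetStatus struct {
    // HpaReplicas mirrors the observed value so the scale subresource
    // can report it in the status.
    HpaReplicas int32 `json:"hpa_replicas"`
}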

The initial scalingHint() method checks the .spec.hpa_replicas property in addition to the CPU metrics to determine the next scaling operation.
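A rough sketch of that decision is shown below; the function signature and the exact conditions are assumptions, not the PR's code.

// Hypothetical shape of the hint logic: the HPA-requested replica count
// participates in the decision alongside the CPU signal.
type ScalingDirection int

const (
    NONE ScalingDirection = iota
    UP
    DOWN
)

func scalingHint(currentNodes, hpaReplicas int32, cpuHigh, cpuLow bool) ScalingDirection {
    // More requested nodes than running nodes (or high CPU) hints UP.
    if hpaReplicas > currentNodes || cpuHigh {
        return UP
    }
    // Fewer or equal requested nodes combined with low CPU hints DOWN.
    if hpaReplicas <= currentNodes && cpuLow {
        return DOWN
    }
    return NONE
}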

The subsequent scaleUpOrDown() method contains the core logic for incorporating the HPA replica count into the calculation of the next change in index or node replicas (a hedged sketch of the scale-up arithmetic follows the list below).

  • There will be no scaling operation if scaling is disabled.
  • There will be no scaling operation if there are no indices to manage.
  • The MinNodeReplica condition is satisfied first. The HPA replica count is satisfied in the next scaling step.
  • MaxShardsPerNode is satisfied before the HPA replica count. The HPA replica count may be satisfied in the next scaling step.
  • MinIndexReplicas is satisfied before the HPA replica count. The HPA replica count is satisfied in the next scaling step.
  • For scaling hint UP:
    • Preserve the shard-to-node ratio by increasing index replicas only after the HPA replica count (> 1) has been satisfied. This avoids having to maintain the same shard-to-node ratio while scaling towards the HPA replica count, at the cost of a possibly skewed shard-to-node ratio once the HPA replica count is reached. This step can be revisited to preserve an equal shard-to-node ratio or the MinShardsPerNode condition.
    • As before, calculate the new desired node replicas by reducing the shard-to-node ratio by 1.
    • If the HPA replica count is higher than the node replica count calculated in the previous step, scale up directly to the HPA replica count. This can skew the shard-to-node ratio; the step can be revisited to scale up in smaller node replica increments.
    • Otherwise, scale up to the node replica count calculated in the previous step.
  • For scaling hint DOWN:
    • Scale down index replicas if the index replica count is greater than the MinIndexReplicas setting, then recalculate the number of node replicas based on the new shard-to-node ratio.
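The sketch below illustrates the scale-up arithmetic described in the list above; the function name, rounding and exact conditions are assumptions rather than the PR's actual code.

// desiredNodeReplicas illustrates the UP branch: reduce the shard-to-node
// ratio by one as before, but jump straight to the HPA replica count if it
// asks for more nodes than that step would provide.
func desiredNodeReplicas(totalShards, currentNodes, hpaReplicas int32) int32 {
    if currentNodes < 1 {
        currentNodes = 1
    }
    // Current shard-to-node ratio, rounded up.
    ratio := (totalShards + currentNodes - 1) / currentNodes
    target := ratio - 1
    if target < 1 {
        target = 1
    }
    // Node count needed to reach the reduced ratio, rounded up.
    newNodes := (totalShards + target - 1) / target
    if hpaReplicas > newNodes {
        // Accept a skewed shard-to-node ratio to satisfy the HPA directly.
        return hpaReplicas
    }
    return newNodes
}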

A custom metrics adapter such as kube-metrics-adapter is required to support the ScalingSchedule CRD. The custom metrics server is responsible for collecting the replica counts or scaling values defined by the CRD.
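For context, the HPA wiring could look roughly like the following sketch built with the autoscaling/v2 Go types; the resource names, metric name and target values are placeholders and not taken from the PR.

package example

import (
    autoscalingv2 "k8s.io/api/autoscaling/v2"
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exampleHPA targets the EDS via its scale subresource and reads a
// ScalingSchedule object metric served by kube-metrics-adapter.
func exampleHPA() *autoscalingv2.HorizontalPodAutoscaler {
    minReplicas := int32(1)
    return &autoscalingv2.HorizontalPodAutoscaler{
        ObjectMeta: metav1.ObjectMeta{Name: "es-data-hpa", Namespace: "es-operator-demo"},
        Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
            // The HPA writes the desired replicas into the EDS through
            // the scale subresource instead of editing the StatefulSet.
            ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
                APIVersion: "zalando.org/v1",
                Kind:       "ElasticsearchDataSet",
                Name:       "es-data-simple",
            },
            MinReplicas: &minReplicas,
            MaxReplicas: 10,
            Metrics: []autoscalingv2.MetricSpec{{
                Type: autoscalingv2.ObjectMetricSourceType,
                Object: &autoscalingv2.ObjectMetricSource{
                    // The schedule values are served as an object metric
                    // by the custom metrics adapter.
                    DescribedObject: autoscalingv2.CrossVersionObjectReference{
                        APIVersion: "zalando.org/v1",
                        Kind:       "ScalingSchedule",
                        Name:       "peak-hours",
                    },
                    Metric: autoscalingv2.MetricIdentifier{Name: "peak-hours"},
                    Target: autoscalingv2.MetricTarget{
                        Type:         autoscalingv2.AverageValueMetricType,
                        AverageValue: resource.NewQuantity(8, resource.DecimalSI),
                    },
                },
            }},
        },
    }
}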

Types of Changes

  • New feature (non-breaking change which adds functionality)
  • Refactor/improvements
  • Documentation / non-code

Tasks

List of tasks you will do to complete the PR

  • [x] Improvements to scale up in steps instead of a single-step scale-up to satisfy the HPA replica count.
  • [x] Add tests for EDS status updates.
  • [x] Update e2e tests.
  • [x] Update the getting started section.
  • [x] Add debug notes or links for testing the HPA and the scale subresource.

Review

List of tasks the reviewer must do to review the PR

  • [ ] Tests
  • [ ] Documentation
  • [ ] CHANGELOG

Deployment Notes

  • Verify the custom metrics server RBAC setup before testing the scheduled scaling. Link to the debugging section.

girishc13 avatar Jul 25 '22 13:07 girishc13

@mikkeloscar Can you paste here the snippet that you prepared for creating the fake clientset? I cannot access our chat messages anymore. I have one pending unit test to complete this PR.

girishc13 avatar Sep 06 '22 12:09 girishc13

@otrosien I've completed the last task from my side. The e2e and documentation tests are taking longer than before, which causes a timeout on the workflow step. I can try increasing the CPU resources in the documentation manifests, but I'm not sure it will help.

girishc13 avatar Sep 12 '22 09:09 girishc13

Thanks @girishc13, I'm still on a few other topics and will pick it up later this month.

otrosien avatar Sep 19 '22 19:09 otrosien

@girishc13 sorry I missed your message. Here it is in case you didn't figure it out yet:

package clientset

import (
    mFake "github.com/zalando-incubator/es-operator/pkg/client/clientset/versioned/fake"
    "k8s.io/client-go/kubernetes/fake"
    zFake "k8s.io/metrics/pkg/client/clientset/versioned/fake"
)

// NewFakeClientset returns a Clientset backed by fake clients for core
// Kubernetes, the es-operator CRDs and the metrics API, for use in unit tests.
func NewFakeClientset() *Clientset {
    return &Clientset{
        Interface:  fake.NewSimpleClientset(),
        zInterface: zFake.NewSimpleClientset(),
        mInterface: mFake.NewSimpleClientset(),
    }
}

Didn't have time to check the PR yet :(

mikkeloscar avatar Sep 20 '22 07:09 mikkeloscar