es-operator
operator stuck in scale down loop
Expected Behavior
When CPU load is below scaleDownCPUBoundary, the index replica count should be reduced, and the node count should go down with it.
Actual Behavior
When CPU load is below scaleDownCPUBoundary, the index replica count is not reduced, so the number of nodes does not go down.
Logs:
time="2024-03-19T06:27:48Z" level=info msg="Waiting for operation to stop" eds=es-mci-data namespace=mci
time="2024-03-19T06:27:49Z" level=info msg="Terminating operator loop." eds=es-mci-data namespace=mci
time="2024-03-19T06:27:50Z" level=info msg="Waiting for operation to stop" eds=es-mci-data namespace=mci
time="2024-03-19T06:27:50Z" level=error msg="Failed to operate resource: failed to update status: Put "https://10.10.0.1:443/apis/zalando.org/v1/namespaces/mci/elasticsearchdatasets/es-mci-data/status?timeout=30s": context canceled"
time="2024-03-19T06:27:50Z" level=info msg="Terminating operator loop." eds=es-mci-data namespace=mci
time="2024-03-19T06:28:19Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
time="2024-03-19T06:28:49Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
time="2024-03-19T06:29:19Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
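To make the "stuck" pattern in these logs easier to see, here is a minimal sketch that pulls out the `Scaling hint` entries and measures the gap between them. The regular expression and the helper names (`scaling_hints`, `hint_gaps_seconds`) are my own and are not part of es-operator; the sample text is copied from the last three log lines above.

```python
import re
from datetime import datetime

# Matches only the "Scaling hint" lines; other log lines are ignored.
HINT_RE = re.compile(r'time="([^"]+)" level=info msg="Scaling hint: (\w+)"')

SAMPLE = '''\
time="2024-03-19T06:28:19Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
time="2024-03-19T06:28:49Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
time="2024-03-19T06:29:19Z" level=info msg="Scaling hint: DOWN" eds=es-mci-data namespace=mci
'''


def scaling_hints(log_text):
    """Return a list of (timestamp, direction) for every scaling hint."""
    hints = []
    for ts, direction in HINT_RE.findall(log_text):
        hints.append((datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ"), direction))
    return hints


def hint_gaps_seconds(hints):
    """Seconds elapsed between each pair of consecutive hints."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(hints, hints[1:])]


if __name__ == "__main__":
    hints = scaling_hints(SAMPLE)
    print([direction for _, direction in hints], hint_gaps_seconds(hints))
```

On the excerpt above this reports three DOWN hints spaced 30 seconds apart, with no log line in between indicating that a scale-down operation was actually carried out.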
Steps to Reproduce the Problem
- I have a simple setup: 1 ES cluster with 1 master and 1 EDS managed by es-operator. I have a single index with 2 shards.
- Scaling options:
  enabled: true
  minReplicas: 1
  maxReplicas: 6
  minShardsPerNode: 1
  maxShardsPerNode: 1
  minIndexReplicas: 0
  maxIndexReplicas: 5
  scaleUpCPUBoundary: 50
  scaleUpCooldownSeconds: 60
  scaleUpThresholdDurationSeconds: 30
  scaleDownCPUBoundary: 40
  scaleDownCooldownSeconds: 60
  scaleDownThresholdDurationSeconds: 30
  diskUsagePercentScaledownWatermark: 75
- When I start a basic busybox load generator, CPU usage increases and es-operator scales up by increasing the index replica count. But when I stop the load generator, CPU usage goes down and the replica count is not updated, so the number of nodes remains high.
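For reference, the shard arithmetic implied by the scaling options above can be sketched as follows. This is only my reading of the option names, not a description of es-operator's actual decision logic: I assume minReplicas/maxReplicas bound the number of data nodes, and that the operator keeps shards per node within minShardsPerNode..maxShardsPerNode. All function names are hypothetical.

```python
import math


def total_shards(primaries, index_replicas):
    """Total shard copies: each primary plus its replicas."""
    return primaries * (1 + index_replicas)


def node_range(primaries, index_replicas, min_shards_per_node, max_shards_per_node):
    """Inclusive (fewest, most) node counts that keep shards per node in bounds."""
    shards = total_shards(primaries, index_replicas)
    fewest = math.ceil(shards / max_shards_per_node)
    most = shards // min_shards_per_node
    return fewest, most


def feasible_index_replicas(primaries, min_nodes, max_nodes,
                            min_shards_per_node, max_shards_per_node,
                            min_index_replicas, max_index_replicas):
    """Index replica counts for which some allowed node count exists."""
    feasible = []
    for replicas in range(min_index_replicas, max_index_replicas + 1):
        fewest, most = node_range(primaries, replicas,
                                  min_shards_per_node, max_shards_per_node)
        # The shard-based range must overlap the allowed node range.
        if max(fewest, min_nodes) <= min(most, max_nodes):
            feasible.append(replicas)
    return feasible


if __name__ == "__main__":
    # Values from this issue: 2 primary shards, 1..6 nodes, exactly 1 shard per node.
    print(feasible_index_replicas(2, 1, 6, 1, 1, 0, 5))
```

Under these assumptions, with 2 primary shards and exactly 1 shard per node, only index replica counts 0, 1 and 2 are reachable, corresponding to 2, 4 and 6 nodes. A scale down from 6 nodes would therefore have to drop one index replica and 2 nodes together.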
Specifications
- Version: ES 8.12.2, es-operator: latest (should be 0.1.4)
- Platform: Google Cloud Kubernetes cluster