HPA reconcile keeps generating new recommendations
🐛 Describe the bug

When we update the HPA configuration, the controller enqueues the objects and immediately generates new recommendations.

The controller normally updates the CR object, which triggers a new reconciliation loop. We should use a default freeze (stabilization) window to skip these frequent updates.
```
I0930 22:50:10.750098 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:10.750123 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[8,21,36,51,44,10,56.99999999999999,30,19,14.000000000000002]
I0930 22:50:10.750137 1 autoscaler.go:298] "Metrics aggregated" currentValue=43.68571428571429 trend=0 confidence=0 podCount=10
I0930 22:50:10.750143 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:10.750156 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":8,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":43.68571428571429,"trend":0}}
I0930 22:50:10.759901 1 podautoscaler_controller.go:523] "Successfully rescaled" PodAutoscaler="default/podautoscaler-mock-llama2-7b" currentReplicas=10 desiredReplicas=8 reason="All metrics below target"
I0930 22:50:10.983530 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:10.983570 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[12,56.99999999999999,72,71,72,54,33,35,7.000000000000001,5]
I0930 22:50:10.983597 1 autoscaler.go:298] "Metrics aggregated" currentValue=45.51428571428571 trend=0 confidence=0 podCount=10
I0930 22:50:10.983616 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:10.983658 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":7,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":45.51428571428571,"trend":0}}
E0930 22:50:10.997355 1 controller.go:316] "msg"="Reconciler error" "error"="failed to apply scaling for Deployment/default/mock-llama2-7b: Operation cannot be fulfilled on deployments.apps \"mock-llama2-7b\": the object has been modified; please apply your changes to the latest version and try again" "PodAutoscaler"={"name":"podautoscaler-mock-llama2-7b","namespace":"default"} "controller"="podautoscaler" "controllerGroup"="autoscaling.aibrix.ai" "controllerKind"="PodAutoscaler" "name"="podautoscaler-mock-llama2-7b" "namespace"="default" "reconcileID"="34119e05-3292-485c-8a90-4b9e124143f2"
I0930 22:50:11.060594 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:11.060615 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[82,62,20,60,34,31,78,60,5,84]
I0930 22:50:11.060628 1 autoscaler.go:298] "Metrics aggregated" currentValue=46.275000000000006 trend=0 confidence=0 podCount=10
I0930 22:50:11.060637 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:11.060652 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":7,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":46.275000000000006,"trend":0}}
I0930 22:50:11.069311 1 podautoscaler_controller.go:523] "Successfully rescaled" PodAutoscaler="default/podautoscaler-mock-llama2-7b" currentReplicas=8 desiredReplicas=7 reason="All metrics below target"
I0930 22:50:11.139566 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:11.139608 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[91,24,48,28.000000000000004,47,28.000000000000004,80,71,84,43]
I0930 22:50:11.139631 1 autoscaler.go:298] "Metrics aggregated" currentValue=46.62499999999999 trend=0 confidence=0 podCount=10
I0930 22:50:11.139643 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:11.139670 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":6,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":46.62499999999999,"trend":0}}
E0930 22:50:11.150610 1 controller.go:316] "msg"="Reconciler error" "error"="failed to apply scaling for Deployment/default/mock-llama2-7b: Operation cannot be fulfilled on deployments.apps \"mock-llama2-7b\": the object has been modified; please apply your changes to the latest version and try again" "PodAutoscaler"={"name":"podautoscaler-mock-llama2-7b","namespace":"default"} "controller"="podautoscaler" "controllerGroup"="autoscaling.aibrix.ai" "controllerKind"="PodAutoscaler" "name"="podautoscaler-mock-llama2-7b" "namespace"="default" "reconcileID"="8b5a5f2b-7586-46c1-91e1-896bc81b29d4"
I0930 22:50:11.197372 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:11.197393 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[26,35,59,71,17,70,98,80,68,17]
I0930 22:50:11.197405 1 autoscaler.go:298] "Metrics aggregated" currentValue=46.5875 trend=0 confidence=0 podCount=10
I0930 22:50:11.197414 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:11.197427 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":6,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":46.5875,"trend":0}}
I0930 22:50:11.205956 1 podautoscaler_controller.go:523] "Successfully rescaled" PodAutoscaler="default/podautoscaler-mock-llama2-7b" currentReplicas=7 desiredReplicas=6 reason="All metrics below target"
I0930 22:50:11.254147 1 autoscaler.go:269] "Collecting metrics" source="default/podautoscaler-mock-llama2-7b" total pods=10 metrics available pods=10
I0930 22:50:11.254169 1 autoscaler.go:273] "Processing metrics snapshot" source="default/podautoscaler-mock-llama2-7b" values=[68,76,12,47,53,30,56.00000000000001,42,5,68]
I0930 22:50:11.254181 1 autoscaler.go:298] "Metrics aggregated" currentValue=45.5375 trend=0 confidence=0 podCount=10
I0930 22:50:11.254191 1 autoscaler.go:320] "Computing scaling recommendation" source="default/podautoscaler-mock-llama2-7b" algorithm="apa"
I0930 22:50:11.254205 1 autoscaler.go:326] "Scaling recommendation computed" source="default/podautoscaler-mock-llama2-7b" algorithm="apa" recommendation={"DesiredReplicas":5,"Confidence":0,"Reason":"apa scaling based on current metrics","Algorithm":"apa","ScaleValid":true,"Metadata":{"current_value":45.5375,"trend":0}}
```
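Independent of any stabilization window, the `Operation cannot be fulfilled ... the object has been modified` errors above are apiserver write conflicts from updating a stale Deployment; the scale write could be wrapped in a conflict retry (client-go ships this pattern as `retry.RetryOnConflict`). A minimal dependency-free sketch of the pattern, with all names hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the apierrors.IsConflict check a real controller would use.
var errConflict = errors.New("the object has been modified")

// retryOnConflict re-runs fn (which should re-fetch the object and re-apply the
// mutation) while the apiserver keeps rejecting a stale update.
func retryOnConflict(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	calls := 0
	err := retryOnConflict(5, func() error {
		calls++
		if calls < 3 {
			return errConflict // simulate two stale writes before success
		}
		return nil
	})
	fmt.Println(err == nil, calls) // true 3
}
```

In the real controller, the closure would re-read the Deployment's latest `resourceVersion` before re-applying the replica change.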
Steps to Reproduce
Update the HPA configuration.
Expected behavior
The replica count should stabilize instead of being rescaled on every reconcile.
Environment
nightly
/assign
Will take a look at this one.
@Jeffwan There is a simple way (similar to the HPA resource in Kubernetes): add a `stabilizationWindowSeconds` field to `PodAutoscalerSpec` (default: 300s):
```go
// StabilizationWindowSeconds is the number of seconds the autoscaler should wait
// before scaling again (both up and down) after a successful scale operation.
// This prevents rapid fluctuations in replica count.
// Defaults to 300 (5 minutes) if not specified.
// +optional
// +kubebuilder:default=300
// +kubebuilder:validation:Minimum=0
// +kubebuilder:validation:Maximum=3600
StabilizationWindowSeconds *int32 `json:"stabilizationWindowSeconds,omitempty"`
```
In the reconcile loop, before calling `executeScalingPipeline`, check whether we are still within the stabilization window after the last successful scale.
Alternatively, we could expose a `behavior` block, like the built-in HPA, so users can set per-direction cooldown windows:
```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 60
  scaleDown:
    stabilizationWindowSeconds: 600
```
```yaml
apiVersion: autoscaling.aibrix.ai/v1alpha1
kind: PodAutoscaler
metadata:
  name: ss-pool-decode
  namespace: default
  annotations:
    autoscaling.aibrix.ai/storm-service-mode: "pool"
spec:
  scaleTargetRef:
    apiVersion: orchestration.aibrix.ai/v1alpha1
    kind: StormService
    name: ss-pool
  # Select the decode role within the StormService
  subTargetSelector:
    roleName: decode
  minReplicas: 3
  maxReplicas: 30
  scalingStrategy: APA
  metricsSources:
    - metricSourceType: pod
      protocolType: http
      port: "8000"
      path: /metrics
      targetMetric: "decode_batch_utilization"
      targetValue: "70"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 600
```