keda
Respect cooldownPeriod for the first deployment and let the service come up and run based on the Deployment's replica count the first time.
Proposal
Hey, from my understanding based on the current documentation, the cooldownPeriod in KEDA only takes effect after a scaling trigger has occurred. When initially deploying a Deployment or StatefulSet, KEDA immediately scales it to minReplicaCount, regardless of the cooldownPeriod.
It would be incredibly beneficial if the cooldownPeriod could also apply when scaling resources for the first time. Specifically, this would mean that upon deployment, the resource scales based on the defined replicas in the Deployment or StatefulSet and respects the cooldownPeriod before any subsequent scaling operations.
Use-Case
This enhancement would provide teams with a more predictable deployment behavior, especially during CI/CD processes. Ensuring that a new version of a service is stable upon deployment is critical, and this change would give teams more confidence during releases.
Is this a feature you are interested in implementing yourself?
No
Anything else?
No response
Hello,
During the CD process KEDA doesn't modify the workload. I mean, IIRC you are right about the first-time deployment: KEDA doesn't take the cooldownPeriod into account there (for scaling to 0; it never applies when scaling to minReplicaCount).
Do you see this behaviour on every CD run? I mean, does this happen every time you deploy your workload? Is your workload scaled to 0 or to minReplicaCount? Could you share an example of your ScaledObject and also an example of your workload?
Hey,
This is my configuration for KEDA; the minimum replica count is set to 0.
spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          policies:
          - periodSeconds: 300
            type: Pods
            value: 1
          stabilizationWindowSeconds: 1800
        scaleUp:
          policies:
          - periodSeconds: 300
            type: Percent
            value: 100
          stabilizationWindowSeconds: 0
    restoreToOriginalReplicaCount: true
  cooldownPeriod: 1800
  fallback:
    failureThreshold: 3
    replicas: 1
  maxReplicaCount: 10
  minReplicaCount: 0
  pollingInterval: 20
  scaleTargetRef:
    name: test
  triggers:
  - authenticationRef:
      name: test
    metadata:
      mode: QueueLength
      protocol: amqp
      queueName: test
      value: "150"
    type: rabbitmq
On the other hand, the replica count of the service's Deployment is set to 1. The Deployment also has liveness and readiness probes, and most of the time the service needs about 3 minutes to be up and ready.
This is the command that our CD runs each time it deploys the service.
helm upgrade test ./ --install -f value.yaml -n test --set 'image.tag=test_6.0.0' --atomic --timeout 1200s
When using Helm with the --atomic flag, Helm expects the service to come up and its readiness/liveness probes to pass before marking the deployment as successful. However, with KEDA's minReplicaCount set to 0, our service is immediately scaled down to zero replicas, even before the triggers are evaluated.
This behavior leads Helm to assume the deployment was successful, while that's not necessarily true. In fact, the service was not up and running for even 20 seconds; it was killed by KEDA because the minimum replica count is set to 0.
I believe that respecting the cooldownPeriod and using the Deployment's replica count when deploying the service would be beneficial in these cases.
For the moment, I have to set the minimum replica count to 1 to work around this issue.
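For reference, this is roughly what that workaround looks like on the ScaledObject (a sketch; only the fields shown change, the rest of the spec stays as above):

spec:
  cooldownPeriod: 1800   # only applies when scaling to zero, so it is effectively unused here
  minReplicaCount: 1     # workaround: keeps one replica alive so the Helm --atomic rollout can pass its probes
  maxReplicaCount: 10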
On the other hand, the replica count of the service's Deployment is set to 1. The Deployment also has liveness and readiness probes, and most of the time the service needs about 3 minutes to be up and ready.
Do you mean that your Helm chart always sets replicas: 1? Don't you have any condition to skip this setting? The Deployment manifest is idempotent; I mean, whatever you set there will be applied for at least a few seconds. If you set 1, your workload will scale to 1 until the next HPA controller cycle.
As I said, this could happen the first time you deploy a ScaledObject, but not on subsequent deploys; the reason behind this behavior on subsequent deploys may be that you are explicitly setting replicas in the Deployment manifest.
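As an illustration of the condition being suggested here, a common Helm chart pattern (a sketch; the autoscaling.enabled value and replicaCount name are chart-specific assumptions) is to render replicas only when autoscaling is disabled, so upgrades don't keep resetting the count that KEDA/HPA manages:

# deployment.yaml — relevant excerpt from a hypothetical chart template
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}  # omitted when KEDA/HPA owns the replica count
  {{- end }}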
Yes, replicas is set to 1 in the service's Deployment.
I even increased initialDelaySeconds to 300 for the liveness and readiness probes, so normally, when I set KEDA's minimum replica count to 1, Helm waits 300 seconds to get confirmation that the service is up and running.
When I set the minimum replica count to 0, the service is shut down by KEDA after 5 seconds and Helm reports that the service was deployed successfully, which is not right!
kubectl describe ScaledObject -n test
Normal KEDAScalersStarted 5s keda-operator Started scalers watch
Normal ScaledObjectReady 5s keda-operator ScaledObject is ready for scaling
Normal KEDAScaleTargetDeactivated 5s keda-operator Deactivated apps/v1.Deployment test/test from 1 to 0
And please consider that the ScaledObject is applied by Helm alongside other resources like the Deployment, Ingress, and Service.
Moreover, we use KEDA in our staging environments, which are not under load most of the time, so usually there are no messages in the queue and its length is 0. So the replica count is set to 0 and that's fine! The issue arises when we deploy a new version: how can we make sure the service is working well and not crashing when it will be shut down by KEDA?
As a result, it would be great if KEDA used the Deployment's replica count as a base each time.
KEDA should set the replica count to 1 when deploying the service:
replicas: 1
maxReplicaCount: 10
minReplicaCount: 0
KEDA should still set the replica count to 1 when deploying the service:
replicas: 1
maxReplicaCount: 10
minReplicaCount: 5
In this case, KEDA can set the replica count to 5 when re-deploying the service:
replicas: 5
maxReplicaCount: 10
minReplicaCount: 5
I can even set a time-based annotation on the ScaledObject (with the help of Helm) so the ScaledObject is updated after each deploy.
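A sketch of that idea (the annotation key is made up for illustration): Helm can stamp the ScaledObject with the release time so it changes on every helm upgrade:

# scaledobject.yaml — excerpt from a hypothetical chart template
metadata:
  annotations:
    deployed-at: {{ now | date "2006-01-02T15:04:05Z" | quote }}  # changes on every release, forcing an update of the ScaledObject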
I guess that we could implement some initialDelay or something like that, but I'm still not sure why this happens after the 1st deployment. The 1st time it can happen, but after that I thought that it shouldn't.
Am I missing any important point @zroubalik ?
Yeah, this is something we can add. Right now KEDA immediately scales to minReplicaCount if there's no load.
+1.
We have exactly the same requirement. KEDA should have an initialDelay before starting to make scaling decisions. This is very helpful when you deploy something and need it immediately available. Then KEDA should scale things to idle/minimum if not used.
Imagine a deployment with Prometheus as the trigger (or any other pull-based trigger). The deployment is immediately scaled to zero, and only after the polling interval will it become available again.
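To make that scenario concrete, here is a minimal sketch (all names, the server address, and the query are hypothetical): as soon as this ScaledObject is applied with minReplicaCount: 0, the Deployment is deactivated to zero, and it only comes back once the trigger reports activity after a polling cycle.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app
spec:
  scaleTargetRef:
    name: my-app
  pollingInterval: 30
  cooldownPeriod: 1800      # not honored before the first activation today
  minReplicaCount: 0
  maxReplicaCount: 5
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      query: sum(rate(http_requests_total{app="my-app"}[2m]))
      threshold: "5"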
I agree that implementing a cooldown period for initial scaling in KEDA would be extremely beneficial, especially when using KEDA for serverless architectures. It's crucial to have a cooldown period after the first deployment before allowing the system to scale down to zero. This would provide a stabilization phase, ensuring that the service runs smoothly post-deployment and letting teams assess the deployment's effectiveness before the service scales down to zero. Such a cooldown is particularly important for smooth and predictable scaling behavior in serverless environments.
Maybe we can easily fix this by just honoring cooldownPeriod in this case too. I think we check whether lastActive has a value, but we could just assign a default value. WDYT @kedacore/keda-core-contributors?
This is implementable, but probably as a new setting, to not break existing behavior?
The bug has become a feature? xD Yep, we can use a new field for it.
Well, it has been there since the beginning 🤷♂️ 😄 I am open to discussion.
The workaround we have in place right now, since we deploy ScaledObjects with an operator, is to not add idleReplicas while the ScaledObject's creationTimestamp is younger than the cooldownPeriod. After that, we set idleReplicas to zero.
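In other words, something like this (a sketch; the surrounding values are illustrative, and idleReplicaCount is the spec field the comment refers to as idleReplicas):

# Phase 1 — applied at deploy time, while the ScaledObject is younger than cooldownPeriod
spec:
  cooldownPeriod: 1800
  minReplicaCount: 1
  maxReplicaCount: 10

# Phase 2 — patched in by the operator once the ScaledObject's age exceeds cooldownPeriod
spec:
  cooldownPeriod: 1800
  idleReplicaCount: 0   # now allows scaling to zero when there is no activity
  minReplicaCount: 1
  maxReplicaCount: 10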
I have a problem. When I use a ScaledObject to manage a Deployment, the cooldownPeriod configured at creation works correctly, but after an update the modified cooldownPeriod no longer takes effect. Also, the minimum replica count does not take effect when set to one, but it works when it is zero. #5321
@JorTurFer Does cooldownPeriod only take effect when minReplicaCount is equal to 0?
Yes, it only works when minReplicaCount or idleReplicaCount is zero.
@JorTurFer Hello, any plan for this fix? thanks.
I'm not sure if there is consensus about how to fix it. @zroubalik, is a new field like initialDelay the way to go?
Once a solution is agreed, anyone who is willing to contribute can help with the fix.
Yeah, a new field, maybe initialCooldownPeriod, to be consistent in naming? And maybe put it into the advanced section? Not sure.
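A rough sketch of what that could look like on a ScaledObject (both the field name and its placement were still open questions at this point, so treat them as assumptions):

spec:
  cooldownPeriod: 1800         # applies after the last reported activity
  initialCooldownPeriod: 300   # proposed: delay before the first scale-to-zero after the ScaledObject is created
  minReplicaCount: 0
  maxReplicaCount: 10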
support the proposal
Is there any progress here? We really need this feature :) Thanks.
The feature is almost ready; small changes are pending (but KubeCon got in the way).