serving
Autoscaler scales app to less than activation-scale
What version of Knative?
1.10.0
Expected Behavior
Services with activation-scale of "2" and initial-scale of "2" should always have at least two replicas running when not scaled to zero. For example, when deployed the service initially scales to two replicas; if it then receives no traffic for the scale-to-zero period, it goes directly from two replicas to zero.
I'm not sure if this is strictly a bug or my misunderstanding of the expected outcome based on this description: https://github.com/knative/serving/blob/bb043fa122fcbf57ae083c31948d166a7e2a7f1d/pkg/apis/autoscaling/register.go#L220-L225
ActivationScale is the minimum, non-zero value that a service should scale to.
Actual Behavior
Initially the service correctly scales up to 2 replicas, but if it receives no traffic for the scale-down period it scales down to one replica before finally scaling to zero replicas.
If the service receives traffic (even one request) while one replica is running, it scales back up to two replicas.
Steps to Reproduce the Problem
Create a service with the following annotations:
autoscaling.knative.dev/activation-scale: "2"
autoscaling.knative.dev/initial-scale: "2"
Allow the service to scale down by sending it no traffic
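For reference, a minimal Service manifest for reproducing this (the service name and container image are placeholders; any image will do):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: activation-scale-test   # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Both annotations set to "2" as described in the issue
        autoscaling.knative.dev/activation-scale: "2"
        autoscaling.knative.dev/initial-scale: "2"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```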
/triage accepted
What I observed:
When creating the new Knative service
- The number of initial pods was correct (matches initial-scale)
- The autoscaler immediately scaled down to one pod (incorrect)
- It eventually scaled to zero
When sending traffic
- The correct number of pods scaled up (matches activation-scale)
- After 60s (stable window?) it scaled down to one pod (incorrect)
- After another 30s it terminated the last pod
cc @psschwei in case you have input
/assign @xtreme-vikram-yadav
+1 I have seen this too when trying that feature. I could not reproduce the expected behavior (but was not sure if I was missing anything).
Does the scale down from 2 --> 1 --> 0 happen every time, or just the first time? If it's just the first time, then the issue might be with initial scale.
From the release notes: Note that the initial target scale for a revision is still handled by initial-scale; activation-scale will only apply on subsequent scales from zero.
(I think that got written after the code merged, but probably not a bad idea to update the comment to include that info)
It could also be that activation-scale only applies when scaling up... it's been a while and most of the discussion of this was on the old Slack instance. That sounds right, based on what I remember, but again, it's been a while...
Does the scale down from 2 --> 1 --> 0 happen every time, or just the first time?
Confirmed that it happens every time (when the revision isn't getting traffic), not just the first time.
From the release notes: Note that the initial target scale for a revision is still handled by initial-scale; activation-scale will only apply on subsequent scales from zero.
In my test case, both initial-scale and activation-scale are set to 2.
Looking at the docs for activation-scale (link), it seems that the intention was to cover just the scale up case (in other words, an initial scale that would be applied when scaling from zero).
For initial scale, we have the following in the docs: " After the Revision has reached this scale one time, this value is ignored. This means that the Revision will scale down after the initial target scale is reached if the actual traffic received only needs a smaller scale." It might make sense to add that to the activation scale section as well.
All that said, I'm in no way opposed to changing activation scale to the expected behavior raised in this issue if that's what folks would prefer.
I think documenting the current behaviour of activation-scale should suffice. Leaving the scale-down decision to the autoscaler makes sense unless there is a specific use case for supporting a scale-down to zero governed by the value of activation-scale.
@diarmuidie are you currently affected by this, or would improved documentation suffice?
@diarmuidie following up are you affected by the current behaviour?
We've updated the documentation, but that doesn't mean we shouldn't change the behaviour to be more intuitive/obvious.
Sorry about the delay, I somehow missed your comments. LGTM. Thanks for the doc update @dprotaso and @xtreme-vikram-yadav 👍