serving
Autoscaler scales app to less than activation-scale
What version of Knative?
1.10.0
Expected Behavior
Services with activation-scale of "2" and initial-scale of "2" should always have at least two replicas running when not scaled to zero. For example, when deployed the service initially scales to two replicas; if it then receives no traffic for the scale-to-zero period, it goes directly from two replicas to zero.
I'm not sure if this is strictly a bug or my misunderstanding of the expected outcome based on this description: https://github.com/knative/serving/blob/bb043fa122fcbf57ae083c31948d166a7e2a7f1d/pkg/apis/autoscaling/register.go#L220-L225
ActivationScale is the minimum, non-zero value that a service should scale to.
Actual Behavior
Initially the service correctly scales up to 2 replicas, but if it receives no traffic for the scale-down period it scales down to one replica before finally scaling to zero replicas.
If the service receives traffic (even one request) while one replica is running, it scales back up to two replicas.
Steps to Reproduce the Problem
Create a service with the following annotations:
autoscaling.knative.dev/activation-scale: "2"
autoscaling.knative.dev/initial-scale: "2"
Allow the service to scale down by sending it no traffic
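For reference, a minimal Service manifest for reproducing this (the service name and container image are placeholders; any image will do):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: activation-scale-test   # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Both annotations set to "2" as described in the issue
        autoscaling.knative.dev/activation-scale: "2"
        autoscaling.knative.dev/initial-scale: "2"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```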
/triage accepted
What I observed:
When creating the new Knative service
- The number of initial pods was correct (matches initial-scale)
- The autoscaler immediately scaled down to one pod (incorrect)
- It eventually scaled to zero
When sending traffic
- The correct number of pods scaled up (matches activation-scale)
- After 60s (stable window?) it scaled down to one pod (incorrect)
- After another 30s it terminated the last pod
cc @psschwei in case you have input
/assign @xtreme-vikram-yadav
+1 I have seen this too when trying that feature. I could not reproduce the expected behavior (but was not sure if I was missing anything).
Does the scale down from 2 --> 1 --> 0 happen every time, or just the first time? If it's just the first time, then the issue might be with initial scale.
From the release notes: Note that the initial target scale for a revision is still handled by initial-scale; activation-scale will only apply on subsequent scales from zero.
(I think that got written after the code merged, but probably not a bad idea to update the comment to include that info)
It could also be that activation-scale only applies when scaling up... it's been a while and most of the discussion of this was on the old Slack instance. That sounds right, based on what I remember, but again, it's been a while...
Does the scale down from 2 --> 1 --> 0 happen every time, or just the first time?
Confirmed that it happens every time (when the revision isn't getting traffic), not just the first time.
From the release notes: Note that the initial target scale for a revision is still handled by initial-scale; activation-scale will only apply on subsequent scales from zero.
In my test case, both initial-scale and activation-scale are set to 2.
Looking at the docs for activation-scale (link), it seems that the intention was to cover just the scale up case (in other words, an initial scale that would be applied when scaling from zero).
For initial scale, we have the following in the docs: " After the Revision has reached this scale one time, this value is ignored. This means that the Revision will scale down after the initial target scale is reached if the actual traffic received only needs a smaller scale." It might make sense to add that to the activation scale section as well.
All that said, I'm in no way opposed to changing activation scale to the expected behavior raised in this issue if that's what folks would prefer.
I think documenting the current behaviour of activation-scale should suffice. Leaving the scale-down decision to the autoscaler makes sense unless there is a specific use case for supporting a scale-down to zero governed by the value of activation-scale.
@diarmuidie are you currently affected by this, or would improved documentation suffice?
@diarmuidie following up are you affected by the current behaviour?
We've updated the documentation, but that doesn't mean we shouldn't change the behaviour to be more intuitive/obvious.
Sorry about the delay, I somehow missed your comments. LGTM. Thanks for the doc update @dprotaso and @xtreme-vikram-yadav 👍