scale-to-zero-pod-retention-period max value not documented

Open bjornrydahl opened this issue 2 years ago • 9 comments

Ask your question here:

I'm trying to set the scale-to-zero-pod-retention-period documented here: https://knative.dev/docs/serving/autoscaling/scale-to-zero/#scale-to-zero-grace-period to something higher than 1h. It appears that there is a max value of 1h, but in the docs it says "Possible values: Non-negative duration string"

admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: expected 0s <= 2h <= 1h0m0s: spec.template.metadata.annotations.autoscaling.knative.dev/scale-to-zero-pod-retention-period

I would like to set something higher than 1h, and would like to know whether there is a reason why the max is 1h or whether there is a workaround I can use.

Thank you!

bjornrydahl avatar Feb 22 '23 13:02 bjornrydahl

I don't have the answer to your question, maybe @dprotaso or @psschwei have an idea?

ReToCode avatar Mar 08 '23 12:03 ReToCode

Looks like the 1h is hard coded and we bound some of these annotations to this max window. https://github.com/knative/serving/blob/f031fd4e16e23c404ba2531b71f23d289a74889a/pkg/apis/autoscaling/register.go#L144-L146

Oddly though - the bound only applies when it's an annotation on the revision - but not the global setting in the config map.
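A minimal sketch of the asymmetry described above, not the actual Knative code: it assumes the per-revision annotation is checked against a 0s to 1h range (as the webhook error suggests) while the global config map value is only required to be a non-negative duration. The names `parse_duration`, `validate_revision_annotation`, and `validate_global_setting` are hypothetical, and the parser handles only a simplified subset of Go-style duration strings.

```python
import re
from datetime import timedelta

# Assumption: mirrors the hard-coded one-hour bound mentioned in this thread.
MAX_ANNOTATION_WINDOW = timedelta(hours=1)

# Seconds per unit for a simplified subset of Go-style duration strings.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '30s', '2h' or '1h30m' (hypothetical helper)."""
    pos, seconds = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:  # unparsed characters between tokens
            break
        seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(seconds=seconds)


def validate_revision_annotation(value: str) -> timedelta:
    """Per-revision annotation: assumed to be bounded to [0s, 1h]."""
    duration = parse_duration(value)
    if duration > MAX_ANNOTATION_WINDOW:
        raise ValueError(f"expected 0s <= {value} <= 1h0m0s")
    return duration


def validate_global_setting(value: str) -> timedelta:
    """Global config map value: assumed to have no upper bound."""
    return parse_duration(value)
```

Under these assumptions, `validate_revision_annotation("2h")` raises a `ValueError` similar to the webhook message in the original report, while `validate_global_setting("24h")` is accepted.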

I'm not sure what's right and wrong here - so this requires a deeper dive

/triage accepted

dprotaso avatar Mar 15 '23 16:03 dprotaso

Paul found the original feature track document and it mentions

"There is no initial maximum possible value for the flag, but in the end we might want to clamp it, say with one hour limit (as we do for stable window flag)."

dprotaso avatar Mar 15 '23 17:03 dprotaso

So @bjornrydahl, you sorta have a workaround: set a global value for all revisions.

Otherwise we can relax the max duration constraint, but I'm wondering what durations you've tried to use in the past?

dprotaso avatar Mar 15 '23 17:03 dprotaso

@dprotaso Thank you for the input here, and I apologize for not getting back to this earlier. I will try using the config instead.

We intended to have it longer to see if we could scale to 0 after 24h.

If that is a bad idea, feel free to let me know.

bjornrydahl avatar May 26 '23 05:05 bjornrydahl

Curious - do you prefer to have this as an operator (global) setting for all revisions or is this something you prefer to customize on a per revision basis?

dprotaso avatar May 29 '23 19:05 dprotaso

@dprotaso @psschwei In our cluster deployments I have the same requirement that @bjornrydahl has/had. For some workloads (models) I need to specify a scale-to-zero-pod-retention-period of greater than 1 hour.
Since this issue is still open, am I right to assume:

  • I don't have the ability to configure a duration greater than an hour at the revision level
  • I can set a larger value globally in the config map, but it will apply to all workloads that do not have a value specified in the CR

UPDATE: After doing some testing, my assumptions were correct :-)
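As a rough illustration of the behavior confirmed above (a sketch under stated assumptions, not Knative's implementation): a revision that carries the annotation uses its own value, and every other revision falls back to the global config map value. The function `effective_retention` and the example values are hypothetical; the annotation key is the one shown in the webhook error.

```python
from typing import Mapping

# Annotation key as it appears in the webhook error earlier in this thread.
ANNOTATION_KEY = "autoscaling.knative.dev/scale-to-zero-pod-retention-period"


def effective_retention(annotations: Mapping[str, str], global_value: str) -> str:
    """Return the per-revision value if set, else the global default.

    Assumption: the annotation simply overrides the config map value.
    """
    return annotations.get(ANNOTATION_KEY, global_value)


# Hypothetical setup: a long global default, with a short per-revision override.
global_value = "24h"
long_running = {}  # no annotation, inherits the global value
short_lived = {ANNOTATION_KEY: "30s"}  # within the 1h annotation bound

print(effective_retention(long_running, global_value))
print(effective_retention(short_lived, global_value))
```

The trade-off with this arrangement is that every short-lived workload needs an explicit annotation, because the long value becomes the default for anything left unannotated.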

Thanks for the guidance.

desimonemike123 avatar Oct 31 '24 19:10 desimonemike123

We also have the use case with several nightly batch jobs taking 60-90 minutes. We'd love for them to wake up, run to completion, then go off again.

For the vast majority of our Knative functions though, we want them to scale down after 30 seconds with no requests.

For now we can just set them to never scale to zero and always maintain 1 pod. But obviously it's not ideal!

eupharis avatar Jun 16 '25 19:06 eupharis

"we want them to scale down after 30 seconds with no requests."

Hey @eupharis - have you seen the documentation here: https://knative.dev/docs/serving/autoscaling/scale-to-zero/

Based on your comment you should be covered.

dprotaso avatar Jun 16 '25 20:06 dprotaso