scale-to-zero-pod-retention-period max value not documented
I'm trying to set the scale-to-zero-pod-retention-period documented here: https://knative.dev/docs/serving/autoscaling/scale-to-zero/#scale-to-zero-grace-period to something higher than 1h. There appears to be a max value of 1h, but the docs say "Possible values: Non-negative duration string". Setting 2h is rejected with:
admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: expected 0s <= 2h <= 1h0m0s: spec.template.metadata.annotations.autoscaling.knative.dev/scale-to-zero-pod-retention-period
I would like to set something higher than 1h, and would like to know whether there is a reason the max is 1h, or whether there is a workaround I can use.
Thank you!
I don't have the answer to your question. Maybe @dprotaso or @psschwei has an idea?
Looks like the 1h is hard-coded, and we bound some of these annotations to this max window: https://github.com/knative/serving/blob/f031fd4e16e23c404ba2531b71f23d289a74889a/pkg/apis/autoscaling/register.go#L144-L146
Oddly, though, the bound only applies when the value is set as an annotation on the revision, not when it is set globally in the config map.
I'm not sure what's right or wrong here, so this requires a deeper dive.
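For illustration, here is a minimal Go sketch of the kind of bounds check that would produce the webhook error quoted above. It is not the Knative implementation: the function name `validateRetentionAnnotation` and the constants are hypothetical, and the 0s and 1h limits are simply taken from the error message.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical bounds, copied from the limits shown in the webhook error.
const (
	retentionMin = 0 * time.Second
	retentionMax = 1 * time.Hour
)

// validateRetentionAnnotation is an illustrative helper (not a Knative API).
// It parses a duration string and rejects values outside [retentionMin, retentionMax].
func validateRetentionAnnotation(raw string) error {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return fmt.Errorf("invalid duration %q: %w", raw, err)
	}
	if d < retentionMin || d > retentionMax {
		// Same shape as the message in the thread: bounds are printed as
		// durations, the user-supplied value is printed as given.
		return fmt.Errorf("expected %v <= %s <= %v", retentionMin, raw, retentionMax)
	}
	return nil
}

func main() {
	for _, v := range []string{"30s", "1h", "2h"} {
		fmt.Printf("%s: %v\n", v, validateRetentionAnnotation(v))
	}
}
```

In this sketch the check runs only on the per-revision value, which matches the observation that the global config map setting is not bounded the same way.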
/triage accepted
Paul found the original feature track document, and it mentions:
> There is no initial maximum possible value for the flag, but in the end we might want to clamp it, say with one hour limit (as we do for stable window flag).
So @bjornrydahl, you sorta have a workaround: set a global value for all revisions.
Otherwise, we can relax the max duration constraint, but I'm wondering what durations you've tried to use in the past.
@dprotaso Thank you for the input here, and I apologize for not getting back to this earlier. I will try using the config instead.
We intended to have it longer to see if we could scale to 0 after 24h.
If that is a bad idea, feel free to let me know.
Curious - do you prefer to have this as an operator (global) setting for all revisions, or is this something you prefer to customize on a per-revision basis?
@dprotaso @psschwei
In our cluster deployments I have the same requirement that @bjornrydahl has/had. For some workloads (models) I need to specify a scale-to-zero-pod-retention-period of greater than 1 hour.
Since this issue is still open, am I right to assume:
- I don't have the ability to configure a duration greater than an hour at the revision level
- I can set a larger value globally in the config map, but it will apply to all workloads (that do not have a value specified in the CR)
UPDATE: After doing some testing, my assumptions were correct :-)
Thanks for the guidance.
We also have the use case with several nightly batch jobs taking 60-90 minutes. We'd love for them to wake up, run to completion, then go off again.
For the vast majority of our Knative functions though, we want them to scale down after 30 seconds with no requests.
For now we can just set them to never scale to zero and always maintain 1 pod. But obviously it's not ideal!
> we want them to scale down after 30 seconds with no requests.
Hey @eupharis - have you seen the documentation here: https://knative.dev/docs/serving/autoscaling/scale-to-zero/
Based on your comment you should be covered.